1.  Introduction


This paper describes a case study to determine whether computer-aided prototyping
techniques provide a cost-effective means for re-engineering legacy software [14]. The
case study consists of developing an object-oriented modular architecture for the existing
Janus(A) system [6], and validating the architecture via an executable prototype using the
Computer-Aided Prototyping System (CAPS) [10, 11].

Janus(A) is a software-based war game that simulates ground battles between up to six
adversaries. It is an interactive, closed, stochastic, ground combat simulation that features
precise color graphics. Janus is "interactive" in that command and control functions are
entered by military analysts who decide what to do in crucial situations during simulated
combat. It has gone through six major revisions since 1978. The current version of Janus
operates on Hewlett Packard workstations and consists of a large number of FORTRAN
modules (1918 FORTRAN routines, 115 C routines, and a total of 393K lines of source
code), organized as a flat structure and interconnected with one another via 129 FORTRAN
COMMON blocks, resulting in a software structure that makes modification to Janus very
costly and error-prone. There is a need to modernize the Janus software into a maintainable
and evolvable system (written in C++) and to take advantage of modern personal
computers to make Janus more accessible to the Army. The TRADOC Analysis Center
(TRAC) initiated the HLA Warrior project in 1998 to re-engineer Janus into an HLA-compliant,
PC-based combat simulation, with an improved graphical user interface, object-oriented
source code, and a modern modular architecture [13]. The Software Engineering group
at the Naval Postgraduate School was tasked to extract the existing functionality through
reverse engineering and to produce an object-oriented architecture that supports existing
and required enhancements to Janus functionality. The architecture provides protocols for
communication between the graphical user interface and the simulation models and acts as
a blueprint for developing the C++ code.

The paper is organized as follows. We present the re-engineering process and the resultant
object-oriented architecture in Sections 2 and 3. Section 4 describes our prototyping
experiment. Section 5 summarizes the lessons learned and Section 6 draws some conclusions.


2.  The  Re-engineering  Process 

Software  re-engineering  is  the  process  of  creating  an  abstract  description  of  a  system, 
reasoning  about  a  change  at  a  higher  level  of  abstraction,  and  then  re-implementing  the 
system  [o].  This  section  describes  the  first  two  activities  of  the  re-engineering  process. 


2.1.  Reverse-Engineering

The first step in reverse-engineering is system understanding, which was accomplished
via a series of brief meetings with the client, TRAC-Monterey. We asked questions and
made notes on the system's operation and its current functionality. We paid particular
attention to the client's view of the system to gather their ideas on its strengths,
weaknesses, and desired and undesired functionality. Additionally, we collected copies of the
Janus User's Manual, the Janus Programmer's Manual, the Janus Database Management
Program Manual, the Janus Software Design Manual, and the Janus Algorithm Document
[6-9, 12].

The next step is to abstract the system's functionality and then produce system models
that accurately represent that functionality. Analysis of 393K lines of legacy code is a
daunting but inescapable part of the process. We recoiled from the magnitude of this
effort in the beginning of the project and relied on information contained in the Janus
manuals. In hindsight, this was a mistake that slipped the schedule of the project by several
months. While these documents helped us get started because they contained higher level
information and were much shorter than the code, they were much older and contained
outdated information. We should have started analyzing the source code right away and
should have persistently continued with this task in parallel with all other re-engineering
activities.
ARCHITECTURAL RE-ENGINEERING OF JANUS


Figure 1. Top-level communication structure of the existing Janus software.


We divided the Janus source code by directories amongst the team members to explore,
examine and gather information. Using strictly manual techniques and review procedures,
we were able to get a fairly good idea of what each subroutine was designed to do. We
also used the Software Programmer's Manual [7] to aid in understanding each subroutine's
function. In doing so we were able to group the subroutines by functionality to get a
better understanding of the major data flows between programs. Using that knowledge, we
developed functional models from the data flows.

We used the Computer-Aided Prototyping System (CAPS), an automated tool developed
at the Naval Postgraduate School, to assist in developing the abstract models. CAPS
allowed us to rapidly graph the gathered data and transform it into a more readable and
usable format. Additionally, CAPS enabled us to develop our diagrams separately, and then
join them together under the CAPS environment, where they can be used to generate an
executable model of the architecture. Figure 1 shows the resultant top-level structure of the
existing Janus system. It consists of five subsystems: cs_data_mgmt, scenario_db, janus,
jaaws, and postp. The cs_data_mgmt subsystem manages combat system databases. The
scenario_db subsystem manages the different scenarios and simulation runs in the system.
The janus subsystem simulates the ground battles. The jaaws subsystem allows analysts
to perform post-simulation analysis and the postp subsystem allows Janus users to view
simulation reports.
2.2.  Transformation of Functional Models to Object Models

We developed object models of the Janus system, using the aforementioned materials
to create the modules and associations amongst them. This was probably
the most difficult and most important step. It required a great deal of analysis and focus
to transform the scattered sets of data and functions into small, coherent and
realizable objects, each with its own set of attributes. This step was crucial
because we had to ensure that the classes we created accurately represented the functions
and procedures currently in the software. We first identified a set of core elements and
created an object model for the core elements based on the information from the Database
Management Program Manual [8] and the domain knowledge of the human experts. Then
we analyzed the source code and used the information from the Software Design Manual
to add operations to the object classes. We used HP-UNIX systems
at the TRAC-Monterey facility to run the Janus simulation software to aid in verifying
and/or supplementing the information we obtained from reviewing the source code and
documentation. This step enabled us to better analyze the simulation system, gain
insight into its functionality and further concentrate on module definition and refinement.

2.3.  Refinement of the Object Models and the Development of Object-Oriented Architecture

During this phase of the project, the re-engineering team met several times each week
over a period of months to discuss the object models for the Janus core
elements and the object-oriented architecture for the Janus system. They presented the
models to the domain experts at least once per week to get feedback on the models
and architectures being constructed. In addition, the re-engineering team also presented
the models to the Combat 21 project team.
Many researchers have reported that domain knowledge plays a critical
role in the re-engineering process. Since we were not familiar with
combat simulation, we found that these meetings were invaluable to
our project. Our experience supports the idea that competent engineers unfamiliar with
the application domain have a role to play in re-engineering as well as in requirements
elicitation. A lack of inessential information about the application domain makes
it easier to find new, simpler design structures and architectural concepts to guide the
re-engineering effort. Based on the feedback from the domain experts, the re-engineering team
revised the object models for the Janus core elements and developed a 3-tier object-oriented
architecture for the Janus system (Figure 2).


3.  Software  Architecture  for  the  Janus  Combat  Simulation  System 

Central to the existing Janus Combat Simulation subsystem is the program RUNJAN, which
is the main event scheduler for Janus. RUNJAN determines the next scheduled
event and executes that event. If the next scheduled event is a simulation event,
RUNJAN advances the game clock to the scheduled time of the event and performs that
event. The existing event scheduler uses global arrays and matrices to maintain the
attributes of the objects in the simulation. Hence, one of the major tasks in designing an
object-oriented architecture for the Janus Combat Simulation subsystem is to distribute the
event handling functions to individual objects. Moreover, it is necessary to redefine some
event categories to eliminate redundant coding of the same or similar functions and to take
advantage of dynamic dispatching of event handling functions in the object-oriented
architecture. Interactions between the simulation engine and the world modeler (the interface to
a distributed simulation network) are performed implicitly within the various event handlers
in the existing Janus. Such interactions are made explicit in the new architecture in order
to provide a uniform framework to update World Model objects during the simulation.

Figure 2. The proposed 3-tier object-oriented architecture.

The  new  architecture  uses  an  explicit  priority  queue  of  event  objects  to  schedule  the 
simulation  events.  Each  event  object  has  an  associated  simulation  object,  which  is  the 
target  of  the  event.  There  are  14  event  groups,  which  correspond  to  the  14  event  subclasses 
shown  in  Figure  3. 

Figure 3. The event class hierarchy.

An object-oriented approach enabled us to reduce the number of event types needed in the
simulation. Depending on the subclass which an event object belongs to, the Execute method
will invoke the corresponding event handler of the associated simulation object to handle
the event. The simulation object superclass defines the interface of the event
handlers for the event groups, and provides an empty body as the default implementation.
These are overridden by the actual event handler code at the
subclasses which have non-empty actions associated with the events. The above architecture
enables a very simple realization of the main simulation loop:

initialization;

while not empty(event_queue) loop
    e := remove_event(event_queue);
    e.execute();
end loop;
finalization;
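
The loop above can be illustrated with a short Python sketch. This is not the prototype's code: the class names EventQueue and Tank, the handler names move_update and do_plan, and the use of a binary heap are all assumptions made for illustration. Only the event subclass names MoveUpdateObj and DoPlan come from the text. The sketch shows the two ideas described above: the superclass supplies empty default handlers, and each event subclass dispatches to one handler of its target object.

```python
import heapq
import itertools


class SimulationObject:
    """Superclass: one handler per event group, each empty by default."""

    def move_update(self, now):
        pass  # default implementation: no action

    def do_plan(self, now):
        pass  # default implementation: no action


class Tank(SimulationObject):
    """Hypothetical subclass with a non-empty action for one event group."""

    def __init__(self):
        self.log = []

    def move_update(self, now):  # overrides the empty default
        self.log.append(("move", now))


class Event:
    """An event has a scheduled time and a target simulation object."""

    def __init__(self, time, target):
        self.time = time
        self.target = target

    def execute(self):
        raise NotImplementedError


class MoveUpdateObj(Event):
    def execute(self):
        self.target.move_update(self.time)


class DoPlan(Event):
    def execute(self):
        self.target.do_plan(self.time)


class EventQueue:
    """Priority queue ordered by event time (ties broken by insertion order)."""

    def __init__(self):
        self._heap = []
        self._tie = itertools.count()

    def insert(self, event):
        heapq.heappush(self._heap, (event.time, next(self._tie), event))

    def empty(self):
        return not self._heap

    def remove_event(self):
        return heapq.heappop(self._heap)[2]


def run(queue):
    """Main simulation loop: the loop body never inspects the event type."""
    executed = []
    while not queue.empty():
        e = queue.remove_event()
        e.execute()
        executed.append((e.time, type(e).__name__))
    return executed


if __name__ == "__main__":
    tank = Tank()
    q = EventQueue()
    q.insert(DoPlan(5.0, tank))
    q.insert(MoveUpdateObj(2.0, tank))
    print(run(q))
```

Adding a new event subclass in this sketch requires a new Event subclass and, where needed, a handler override; the run function stays unchanged.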

The event queue is a priority queue of events. Events are inserted into the queue
at the beginning of the simulation, by constructors of new
simulation objects, and by the actions of other event handlers. Depending on the actual
event, new events may be inserted into the event queue.
The queue also allows events to change their priorities while waiting in the queue.

World Model object subclasses (with names starting with the "WM" prefix) are created
with specialized methods for the world modeler to update the objects from other
simulators. Information concerning objects local to the Janus simulator can be broadcast
over the simulation network either periodically by an active world modeler object, or by
individual local objects whenever they update their own states.

Figure 4. The simulation object class hierarchy.

Figure 5. Top-level decomposition of the executable prototype.


4.  Development  of  an  Executable  Prototype  Using  CAPS 

In order to validate the proposed architecture and to refine the interfaces of the Janus
subsystems, we developed an executable prototype using CAPS. Figure 5 shows the
top-level structure of the prototype, which has four subsystems: Janus, GUI, JAAWS and the
POST-PROCESSOR. Among these four subsystems, the Janus and the GUI subsystems
(depicted as double circles) are made up of sub-modules shown in Figures 6 and 7, while
the JAAWS and the POST-PROCESSOR subsystems (depicted as single circles) are mapped
directly to objects in the target language. After entering the prototype design using CAPS,
we used the CAPS execution support system to generate the code that interconnects and
controls these subsystems.

Due to time and resource limitations, we only developed the prototype for a very small
simulation run, which consists of a single object (a tank) moving on a two-dimensional
plane, three event subclasses (MoveUpdateObj, DoPlan, and EndSimulation), and one kind
of post-processing statistics (fuel consumption). In addition, a simple user interface was
developed using TAE [15] (Figure 8). The resultant prototype has over 6000 lines of
program source code and contains enough features to exercise all parts of the architecture.
Figure 6. The JANUS subsystem of the executable prototype.


The code that handles the motion of a generic simulation object was very simple, but it
was designed so that it would work in both two and three dimensions without modification
(currently the initialization and the movement plan of the tank object never call for any
vertical motion). The code was also designed to be polymorphic, just as was the main event
loop. This means the same code will handle the motion of all kinds of simulation objects
without any modifications, including even new types of simulation objects that are part of
future enhancements to Janus and have not yet been designed or implemented.
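
One way to obtain motion code that is indifferent to the number of dimensions is to treat positions as coordinate sequences and never index a specific axis. The Python sketch below illustrates that idea only. The function name step_toward and its parameters are hypothetical, and the straight-line, constant-speed motion model is an assumption, not the Janus movement algorithm.

```python
import math


def step_toward(position, destination, speed, dt):
    """Advance toward destination by at most speed * dt.

    Works for coordinate tuples of any length (2-D, 3-D, ...).
    Returns the new position and the distance actually covered.
    """
    if len(position) != len(destination):
        raise ValueError("position and destination must have the same dimension")
    delta = [d - p for p, d in zip(position, destination)]
    distance = math.sqrt(sum(c * c for c in delta))
    max_step = speed * dt
    if distance <= max_step:
        # Close enough to arrive during this update.
        return tuple(destination), distance
    scale = max_step / distance
    new_position = tuple(p + c * scale for p, c in zip(position, delta))
    return new_position, max_step
```

The same call handles a planar tank and an object with an altitude coordinate; only the length of the tuples differs.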


5.  Lessons  Learned 

Our prototyping experiment showed that the proposed object-oriented architecture allows
design issues to be localized and provides easy means for future extensions. We started
out with a prototype consisting of only two event subclasses (MoveUpdateObj and
EndSimulation) and were able to add a third event subclass (DoPlan) to the prototype without
modifying the event control loop of the Janus combat simulator.

Figure 7. The GUI subsystem of the executable prototype.

Figure 8. The Graphical User Interface of the executable prototype.

We also demonstrated the use of inheritance and polymorphism to efficiently
extend/specialize the behavior of combat units. For example, to implement the MoveUpdateObj
method of a tank subclass which uses the general-purpose method from its superclass to
compute its distance traveled and a specialized algorithm to compute its fuel consumption,
we only need to include one statement to invoke the MoveUpdateObj method of its superclass,
followed by three lines of code to update its fuel consumption. Moreover, other combat
unit subclasses can be added easily to the prototype without the need to modify the event
scheduling/dispatching code.
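
The tank example can be sketched in Python as follows. The class names CombatUnit and Tank, the attribute names, and the linear fuel model (fuel burned proportional to distance) are assumptions for illustration; the text does not give the prototype's fuel algorithm.

```python
class CombatUnit:
    """Superclass with the general-purpose movement handler."""

    def __init__(self, position, velocity):
        self.position = tuple(position)
        self.velocity = tuple(velocity)
        self.distance_traveled = 0.0

    def move_update(self, dt):
        """Advance the position and return the distance covered."""
        step = sum((v * dt) ** 2 for v in self.velocity) ** 0.5
        self.position = tuple(
            p + v * dt for p, v in zip(self.position, self.velocity)
        )
        self.distance_traveled += step
        return step


class Tank(CombatUnit):
    """Specialization: reuses movement, adds fuel bookkeeping."""

    def __init__(self, position, velocity, fuel, fuel_per_unit_distance):
        super().__init__(position, velocity)
        self.fuel = fuel
        self.fuel_per_unit_distance = fuel_per_unit_distance  # assumed linear model

    def move_update(self, dt):
        step = super().move_update(dt)  # one statement reuses the superclass
        burned = step * self.fuel_per_unit_distance
        self.fuel = max(0.0, self.fuel - burned)
        return step
```

Code that calls move_update on a CombatUnit reference does not need to know whether the object is a Tank, which is why the scheduling and dispatching code is unaffected by new subclasses.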

The  prototype  also  resulted  in  the  following  refinements  to  the  proposed  architecture: 

(1) Instead of a procedure with no return value, change the Execute operation to return
the time at which the next event is to be scheduled for the same simulation object, and
introduce a special time value "NEVER" to indicate that no next event is needed. The
proposed change turns the communication between the event dispatcher and the
simulation objects from a peer-to-peer communication into a client-server communication.
This change eliminates the dependency of simulation objects on details of the event
queue and allows the event dispatcher to use a single statement to schedule all recurring
events for all event types.


(2) Instead of recording the history of a simulation run in terms of sets of data files,
model the simulation history as a sequence of events. The proposed change provides a
simple and uniform way to handle history records for all events, and allows the same
modular architecture to be used for real-time simulations as well as post-simulation
analysis. This also provides the greatest possible resolution for the event histories,
which implies that any quantity that could have been calculated during the simulation
can also be calculated by a post-simulation analysis of the event history, without any
loss of accuracy. It also eliminates the need for the WriteStatus event in the legacy
software. The only constraint imposed by this design refinement is that the simulation
objects associated with the events must be copied before being included in the simulation
history, to protect them from further changes of state as the simulation proceeds. This
constraint is easy to meet because the process of writing the contents of an event object
to a history file will implicitly make the required copy.
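
Both refinements can be illustrated together in a small Python sketch. It is a sketch under stated assumptions, not the prototype's implementation: NEVER is encoded here as floating-point infinity, the history is an in-memory list rather than a file, snapshots are taken after each event executes, and the names SimObject, rescheduled, and moves_left are hypothetical.

```python
import copy
import heapq
import itertools
import math

NEVER = math.inf  # assumed encoding of the special time value


class SimObject:
    """Hypothetical object that moves one unit per event, a fixed number of times."""

    def __init__(self, name, moves_left):
        self.name = name
        self.moves_left = moves_left
        self.x = 0.0

    def move_update(self, now):
        self.x += 1.0
        self.moves_left -= 1
        # The handler only reports when it next needs service.
        return now + 1.0 if self.moves_left > 0 else NEVER


class MoveUpdateObj:
    def __init__(self, time, target):
        self.time = time
        self.target = target

    def execute(self):
        """Run the handler and return the time of the next event, or NEVER."""
        return self.target.move_update(self.time)

    def rescheduled(self, time):
        return type(self)(time, self.target)


def run(initial_events):
    tie = itertools.count()
    queue = [(e.time, next(tie), e) for e in initial_events]
    heapq.heapify(queue)
    history = []
    while queue:
        _, _, e = heapq.heappop(queue)
        next_time = e.execute()
        # Refinement (2): the history is a sequence of events; the target
        # is copied so that later state changes cannot alter the record.
        history.append((e.time, type(e).__name__, copy.deepcopy(e.target)))
        # Refinement (1): one statement reschedules any recurring event.
        if next_time != NEVER:
            heapq.heappush(queue, (next_time, next(tie), e.rescheduled(next_time)))
    return history
```

In this sketch the simulation object never touches the queue, and a post-simulation analysis can recompute any per-event quantity from the snapshots in the returned history.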

The prototyping effort also exposed a design issue: should null events appear in the
event queue? A null event is one that does not affect the state of the simulation, such as
a MoveUpdateObj event for an object that is currently stationary. The prototype version
adopted the position that such events should not be put in the event queue, since this
corresponds to scheduling policies in the legacy system, and appears at first glance to
improve efficiency.

Our experience with the development of the prototype suggests that this decision
complicates the logic and may not in fact improve efficiency. In particular, the process
create_new_events could be eliminated from the Janus subsystem (Figure 6) if we allowed null
events. This process scans all simulation objects once per simulation cycle to determine
if any dormant objects have become active, and if so, schedules events to handle their
new activities. The alternative is to have the constructor of each kind of simulation object
schedule all of its initial events, and to have each event handler specify the time of the next
instance of the same event even if there is nothing for it to do currently. A handler might
still set the time of its next event to NEVER in the case of a catastrophic kill; however, this
is reasonable only if it is impossible to repair or restore the operation of the units that have
suffered a catastrophic kill.
The reasons why this design change may improve efficiency in addition to simplifying
the code are that:

(1) the check for whether a dormant object has become active is done less often (once per
activity of that object, rather than once per simulation cycle),

(2) executing a null event is very fast (a few instructions at most), so the "unnecessary"
null events will not have much impact on execution time, and

(3) the computation to find and test all simulation objects periodically would be eliminated.

Our recommendation is to allow null events in the event queue, and to explicitly schedule
every kind of event for every object unless it is known that there cannot be any non-empty
events of that type in any possible future state of the object. For example, under the proposed
scheduling policy, immobile or irrecoverably damaged objects would not need to schedule
future MoveUpdateObj events, but those that are currently at their planned positions would
need to do so, because a change of plan would cause them to move again in the future, even
though they are not currently moving.
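
The recommended policy can be illustrated with a small Python sketch. All names here (Unit, null_events, run, the one-dimensional position, and the fixed update interval of 1.0) are assumptions for illustration. A unit resting at its planned position keeps scheduling null movement events, so a later change of plan takes effect without any periodic scan of dormant objects, while an immobile unit returns NEVER and leaves the queue.

```python
import heapq
import itertools
import math

NEVER = math.inf  # assumed encoding of "no next event"


class Unit:
    def __init__(self, x, destination, mobile=True):
        self.x = x
        self.destination = destination
        self.mobile = mobile
        self.null_events = 0

    def move_update(self, now):
        if not self.mobile:
            return NEVER  # can never move again: stop scheduling
        gap = self.destination - self.x
        if gap == 0:
            self.null_events += 1  # null event: state is unchanged
        else:
            self.x += math.copysign(min(1.0, abs(gap)), gap)
        return now + 1.0  # always reschedule, even when idle

    def do_plan(self, now, new_destination):
        self.destination = new_destination
        return NEVER  # one-shot event


def run(events, end_time):
    """events: iterable of (time, handler); handler(now) returns the next time."""
    tie = itertools.count()
    queue = [(t, next(tie), handler) for t, handler in events]
    heapq.heapify(queue)
    while queue:
        now, _, handler = heapq.heappop(queue)
        if now > end_time:
            break
        next_time = handler(now)
        if next_time != NEVER:
            heapq.heappush(queue, (next_time, next(tie), handler))


if __name__ == "__main__":
    unit = Unit(0.0, 0.0)
    wreck = Unit(5.0, 9.0, mobile=False)
    run(
        [
            (0.0, unit.move_update),
            (0.0, wreck.move_update),
            (3.0, lambda now: unit.do_plan(now, 2.0)),
        ],
        end_time=5.0,
    )
    print(unit.x, unit.null_events, wreck.x)
```

The cost of the policy in this sketch is visible as the null_events counter: each null event is a comparison and an increment, which is the trade the recommendation accepts in exchange for removing the per-cycle scan.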


6.  Conclusion 

Our  experience  in  this  case  study  suggests  that  prototyping  can  be  a  valuable  aid  in  the 
re-engineering  of  legacy  systems,  particularly  in  cases  where  radical  changes  to  system 
conceptualization  and  software  structure  are  needed. 

In particular, we found that constructing even a very thin skeletal instance of the proposed
new architecture raised many issues and enabled us to correct, complete, and optimize the
architecture for both simplicity and performance. This was done before the architecture
had grown into a maze of dependent designs and implementation details. Consequently,
the changes could be realized without incurring the large cost and time delays typically
encountered later in the development.

The  computer-aided  prototyping  tools  in  the  CAPS  system  enabled  us  to  do  this  with 
a  minimal  amount  of  coding  effort.  The  bulk  of  the  code  was  generated  automatically, 
enabling  us  to  concentrate  on  system  structuring  issues,  to  consider  and  evaluate  various 
alternatives,  and  to  improve  the  design  while  doing  detailed  manual  implementation  for 
only  a  few  pages  of  critical  code. 

The object models produced in this project have proven invaluable to the contractors
during the code implementation phase of the US Army TRAC HLA Warrior project and will be
vital to the National Simulation Center Spectrum project. Additionally, our efforts will also
benefit other simulation developers. TRAC-Monterey sent the class design to Combat 21
(CB21) developers at White Sands. CB21 was able to save time and money by reusing the
object models and came up with a design that looks remarkably like ours (although much
larger). The OneSAF developers will look at the CB21 class design and reuse as much as
possible.
Abstract

This  paper  addresses  the  need  to  modernize  the  software  of  the  US  Army  Janus(A)  combat 
simulation  system  into  a  maintainable  and  evolvable  structure.  It  describes  the  effective  use  of 
computer-aided  prototyping  techniques  for  re-engineering  the  legacy  software  and  presents  the 
resultant  object  models  and  modular  architecture  for  the  existing  Janus(A)  system.  The  object 
models  produced  in  this  project  have  proven  invaluable  to  the  contractors  during  code 
implementation  phase  of  the  US  Army  TRAC  HLA  Warrior  project  and  beneficial  to  other 
simulation  developers. 

1.  Introduction 

Re-engineering is typically needed when a system performing a valuable service must change, and
its current implementation can no longer support cost-effective changes. Legacy systems embody
substantial institutional knowledge, which includes basic and refined requirements, design
decisions, and invaluable advice and suggestions from domain users that have been implemented
over the years. To effectively use these assets, it is important to employ a systematic strategy for
continued evolution of the current system to meet the ever-changing mission, technology and user
needs. However, knowledge embedded in these systems is difficult to recover after many years of
operation, evolution, and personnel change. These software systems were originally written
twenty or more years ago using what many now view as an archaic and ad-hoc methodology.
Such legacy systems usually lack accurate documentation, modular structure, and coherent
abstractions that correspond to current or projected requirements. Past optimizations and design
changes have spread design decisions that now must be changed over large areas of the code.
Re-engineering has frequently been proven to be more cost-effective than new development and is
also known to better promote continuous software evolution.

* This research was supported in part by the U.S. Army Research Office under contract # 35037-MA and in part by the U.S.
Army Training and Doctrine Analysis Command.


Re-engineering is the systematic transformation of an existing system into a new form to realize
quality improvements in operation, system capability, functionality, performance, or
evolvability at a lower cost, schedule, or risk to the customer. Such
improvements often take the form of increased or enhanced functionality, better maintainability,
configurability, reusability, and/or other software engineering goals. This process involves
recovering existing software artifacts from the system and then reorganizing them as a basis for
future evolution of the system. The transformation of procedural legacy software into modern
object-oriented architectures introduces certain complications into the software analysis process.
Since typical legacy systems were not originally designed and implemented using an
object-oriented approach, the products of reverse engineering, such as requirements or design
specifications, will probably reflect a function-based decomposition. As a result, some form of
"transformation" of the resultant information is needed in order to use the specifications. Once a
realizable specification based on the transformed object-oriented models is obtained, it is often
very difficult to quickly determine if the specification is a true representation of the desired
requirements. Since legacy systems are usually re-engineered only when the existing systems need
some kind of improvement, it is unlikely that the initial version of the reconstructed requirements
adequately reflects current user needs. Prototyping provides a means to validate new system
requirements while simultaneously enabling prospective users to get a brief feel for aspects of the
proposed system. It is a well-established approach that can be highly effective in increasing
software quality [13]. When used in conjunction with a major re-engineering effort,
prototyping can be effectively used for design
validation, risk reduction, and the refinement of user requirements.


This paper addresses the need to modernize the software of the Janus(A) system into a
maintainable and evolvable structure. It describes the effective use of computer-aided prototyping
techniques for re-engineering the legacy software to develop an object-oriented modular
architecture for the Janus combat simulation system. Janus(A) is a software-based war game
that simulates ground battles between up to six adversaries. It is an interactive, closed,
stochastic, ground combat simulation with color graphics. Janus is "interactive" in that command
and control functions are entered by military analysts who decide what to do in crucial situations
during simulated combat. The current version of Janus operates on a Hewlett Packard
workstation and consists of a large number of FORTRAN modules (1918 FORTRAN routines,
115 C routines, and a total of 393,000 lines of source code). The FORTRAN modules are
organized as a flat structure and interconnected with one another via 129 FORTRAN COMMON
blocks, resulting in a software structure that makes modification to Janus very costly and
error-prone. The Software Engineering group at the Naval Postgraduate School was tasked to extract
the existing functionality through reverse engineering and to produce an object-oriented
architecture that supports existing and required enhancements to Janus functionality.
2.  Reverse Engineering

The re-architecturing process consists of reverse engineering, object-oriented design, and
design validation (Figure 1).


Figure 1. The object-oriented re-architecturing process (stages shown: Reverse Engineering, Object-Oriented Design, and Design Validation, with domain expert feedback).

The goal of the re-engineering effort is to capture the functionality of the existing system
within a modular, extensible architecture and to reuse the domain concepts, models and
algorithms rather than the existing code, to avoid requirements/constraints that are
consequences of the FORTRAN implementation. The first step is to extract domain concepts
from the existing system, the user manuals and the system manuals. These manuals were
written using the lingo of the user community and are relatively free of implementation
details. We found the JANUS Data Base Management Program Manual [8] particularly useful
because it contains detailed information on the data needed to model the battlefield
and how they are organized in the database. The top-level structure of the database is
shown in Figure 2.


Figure 2. The top-level structure of the Janus Database (categories shown include Symbols, Combat Systems, Weapons, Sensors, Engineer, Terrain, Weather, and Chemical/Heat Stress data).


Not shown in Figure 2 are [...] where data entered in one category affect directly or indirectly the data in other categories. For example, the barrier delays of the Engineer Data depend on specific weather conditions specified by the Weather Data and system functional characteristics of the System Data. The overall network of interdependencies is highly complex [...] through construction and analysis of the functional model of the existing Janus software.

[...] but inescapable part of the process. We recoiled from the magnitude of this effort in the beginning and relied on information contained in the Janus manuals. In hindsight, [...] that slipped the schedule of the project by several months. While these documents helped [...] because they contained higher level information and were much shorter than the code, they were much older and [...] complexity requires time for mental digestion. [...] We should have started analysis of the code right away and should have persistently continued this task in parallel with all other re-engineering activities. Cross-fertilization between [...] would have helped us recognize some dead-end directions earlier and would have enabled us to spend meeting time more effectively.

Using manual techniques augmented with simple [...] commands, we were able to walk through the code and get a fairly good idea of what each subroutine was designed to do. We also used the Software Programmers' Manual [6] to [...] subroutine's function. In doing so we were able to [...] functionality to get a better understanding of the major data flows between programs and develop [...] data flows. We used the Computer Aided Prototyping System (CAPS) [...], a research tool developed at the Naval Postgraduate School, to assist in developing the [...] models [...]. CAPS allowed us to rapidly graph the gathered data and transform it into [...] usable. Additionally, CAPS enabled us to [...] together in the CAPS environment, where they can be used to generate [...]

[...] asking questions and [...] functionality. We paid attention to the client's view of the system to [...] its strengths, weaknesses, and desired and undesired functionality. These meetings were indispensable [...] gave us information that was not present in the code. [...] with the domain of ground combat simulation, we were using these meetings to determine the requirements of this domain, often playing the role of "smart ignoramuses" [...]. Domain analysis has been identified as an effective technique for software re-engineering [15]. [...] unfamiliar with the application domain have an essential role in re-engineering as well as in requirements elicitation because lack of inessential [...] about the application domain makes it easier to find new, simpler design structures and architectural concepts to guide the re-engineering effort.

3.  Object-Oriented  Design 

[...] associations amongst them. Information modeling is needed to support effective re-engineering of complex systems [4]. This was probably the most difficult and most important phase. It required a great deal of analysis and focus to transform the currently scattered sets of data and functions into small, coherent and realizable objects, each with its own attributes and operations. In performing this phase, we used our knowledge of object-oriented analysis and [...] OMT techniques [17] and the UML notations to create the classes and associated attributes and operations [...]. This was a crucial phase because we had to ensure that the classes we created accurately represented the functions and procedures currently in the software.

Restructuring software to identify data abstractions is a difficult part of the process. Transformations for meaning-preserving restructuring [...] tool support is available [5]. We used the HP-UNIX systems at the TRAC [...] gaining insight into its functionality and further module definition and refinement.

The re-engineering team met several times each week [...] two and a half months to discuss the object models for the Janus core data elements and the object-oriented architecture for the Janus System. We presented the findings to the [...] domain experts at least once per week to get feedback on the models and architectures being [...]. The re-engineering team also presented the findings to [...] project, the Combat21 project, and the National Simulation Center project. We found that [...] information from these domain experts was essential for understanding the system [...] the legacy [...] did not correspond to stakeholders' needs [...] that involvement of domain experts is critical for nontrivial re-engineering tasks.


Figure 3. The proposed 3-tier object-oriented architecture

Early involvement of the stakeholders in the simulation community also pays off in the long run. Both the National Simulation Center and Combat21 projects were able to save time and money by [...] work and came up with designs that look remarkably like ours (although much larger). Now OneSAF developers have been directed to look at the Combat21 class design and reuse as much as possible. So, our efforts have directly benefited other simulation developers.

Based on the feedback from the domain experts, the re-engineering team revised the object models for the Janus core elements and developed a 3-tier object-oriented architecture for the Janus System. We extracted most of the data and operations from the existing Combat System DBMS, Scenario Management, Janus Combat Simulation, JAAWS and POSTP subsystems and encapsulated them as simulation objects in the Core Elements package, leaving only application-specific control code that uses the simulation objects in each of these five subsystems.

Figures 4 and 5 show the top-level class structures of the object models of the core elements. Details of the associated attributes and operations can be found in [2, 20] and are omitted from these diagrams due to space limitations.


Figure 4. The top-level structure of the Janus Core Elements Object Model

[...] RUNJAN, which is the [...] event scheduler for the simulation. RUNJAN determines the [...] scheduled event and executes that event. If the next scheduled event [...], RUNJAN will advance the [...] clock to the scheduled [...] of that event. The existing Janus Simulation System uses 17 different categories to characterize the events. RUNJAN then handles these 17 events using the following event handlers:

1) DoPlan - Interactive Command and Control activities

2) Movement - Update unit positions

3) DoCloud - Create and update smoke and dust clouds

4) StateWt - Periodic activity to write unit status to disk

5) Reload - Plan and execute the direct fire events

6) Intact - Update the graphics displays

7) CntrBat - Detect artillery fire

8) [...] choose and [...]

10) [...] clouds and [...] different chemical states

[...] Radar - Update an air defense radar state and schedule a direct fire event for "normal" [...]

13) Copter - Update a helicopter's state

14) DoArty - Schedule an indirect fire mission

15) DoHeat - Update units' heat status

16) DoCkpt - Activity to perform automatic checkpoints

17) EndJan - Housekeeping activity to end the simulation

The  legacy  event  scheduler  uses  global  arrays  and  matrices  to  maintain  the  attributes  of  the 
objects  in  the  simulation.  Hence,  one  of  the  major  tasks  in  designing  an  object-oriented 
architecture  for  the  Janus  Combat  Simulation  Subsystem  was  to  distribute  the  event  handling 
functions  to  individual  objects.  However,  many  of  the  current  event  handler  categories  contained 
redundant  code  and  did  not  seem  to  be  very  coherent  with  respect  to  the  class  hierarchy  we 
created.  For  example,  the  set  of  event  handlers  used  to  simulate  the  activities  of  a  particular  unit 
to  search  for  targets,  select  weapons,  prepare  for  a  direct  fire  engagement,  and  then  execute  that 
direct  fire  engagement  differs  depending  upon  whether  the  unit  has  a  normal  radar,  special  radar, 
or  no  radar  at  all.  The  legacy  Janus  Simulation  System  uses  the  Radar  event  handler  to  carry  out 
the  entire  procedure  if  the  unit  has  normal  radar.  However,  it  uses  the  Search,  Radar,  and  Reload 
event  handlers  to  carry  out  the  procedure  if  the  unit  has  special  radar.  Finally  the  system  uses  the 
Search  and  Reload  event  handlers  to  conduct  the  procedure  if  the  unit  has  no  radar  at  all.  We 
conjecture  that  this  lack  of  uniformity  is  due  to  a  series  of  software  modifications  made  by 
different  people  at  different  times  without  full  knowledge  of  the  software  structure. 

It  was  necessary  to  redefine  some  event  categories  in  order  to  reduce  interdependencies  between 
the  event  handlers,  to  factor  simulation  behavior  into  more  coherent  modules,  to  eliminate 
redundant  coding  of  the  same  or  similar  functions  and  to  take  advantage  of  dynamic  dispatching 
of  event  handling  functions  in  the  object-oriented  architecture.  Moreover,  the  Janus  system  was 
originally  designed  to  work  in  isolation,  and  has  since  been  adapted  to  interact  with  other 
simulation  systems.  Interactions  between  the  simulation  engine  and  the  world  modeler  (the 
distributed  simulation  network)  are  performed  implicitly  within  the  various  event  handlers  in  the 
existing  Janus.  Such  interactions  are  made  explicit  in  the  new  architecture  in  order  to  provide  a 
uniform  framework  to  update  World  Model  objects  during  the  simulation. 

The  new  architecture  uses  an  explicit  priority  queue  of  event  objects  to  schedule  the  simulation 
events.  We  were  able  to  reduce  the  total  number  of  event  handlers  needed  in  the  simulation,  from 
17  to  14,  by  eliminating  identified  redundant  code  (Figure  6).  The  14  remaining  event  handlers 
are  as  follows: 

1) DoPlan - Interactive Command and Control activities

2) MoveUpdateObj - Moves and updates the objects in the simulation

3) Search - Searches for potential targets based on the detection devices available to the objects

4) ChooseDirectFireTargets - Once the search is complete, chooses the best target to engage. Future implementations may allow users to choose targets

5) CounterBattery - Simulates counter battery radar to find potential targets

6) DoDirectFire - Executes direct fire events and updates ammunition status

7) DoIndirectFire - Executes indirect fire events and updates ammunition status

8) ImpactEffects - Calculates results of round impacting
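To make the idea of an explicit priority queue of event objects with dynamically dispatched handlers concrete, here is a minimal sketch in Python (the re-engineered system itself targets C++). The class names `SimObject`, `RadarUnit`, `Event`, and `Scheduler`, and the `search` handler bodies, are hypothetical illustrations and are not taken from the Janus design.

```python
import heapq
import itertools
from dataclasses import dataclass, field


class SimObject:
    """Hypothetical base class; subclasses override event handlers."""

    def __init__(self, name):
        self.name = name

    def search(self, time, log):
        log.append((time, self.name, "visual search"))


class RadarUnit(SimObject):
    """Hypothetical subclass: a radar-equipped unit searches differently."""

    def search(self, time, log):
        log.append((time, self.name, "radar search"))


@dataclass(order=True)
class Event:
    time: float
    seq: int                                   # tie-breaker for equal times
    target: SimObject = field(compare=False)
    handler: str = field(compare=False)


class Scheduler:
    """Explicit priority queue of event objects, ordered by scheduled time."""

    def __init__(self):
        self._queue = []
        self._counter = itertools.count()
        self.clock = 0.0

    def schedule(self, time, target, handler):
        event = Event(time, next(self._counter), target, handler)
        heapq.heappush(self._queue, event)

    def run(self, log):
        while self._queue:
            event = heapq.heappop(self._queue)
            self.clock = max(self.clock, event.time)   # advance the clock
            # Dynamic dispatch: the object's class decides which code runs.
            getattr(event.target, event.handler)(self.clock, log)


def demo():
    log = []
    scheduler = Scheduler()
    scheduler.schedule(5.0, RadarUnit("ada-1"), "search")
    scheduler.schedule(2.0, SimObject("tank-1"), "search")
    scheduler.run(log)
    return log
```

In this sketch the scheduler never inspects the kind of unit it is handling; the radar and non-radar variants of the search behavior live in the classes, which is the property the redesign relies on to remove duplicated handler code.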


5. Conclusions

[...] important part of the process because much of the information needed to do a good job of [...] of the [...] system [...] implementation. Thorough and accurate knowledge of these deficiencies is crucial for success [...]. Clients never want the re-engineered system to have exactly the same behavior as the legacy system: if they were satisfied, there would be little motivation to spend time, effort and [...] on a re-engineering project. Even if a system is being re-engineered [...] behavior at the interface to the hardware and systems software will be different.

[...] re-engineered [...] different from those for the legacy system, key parts of the requirements [...] are often missing or incorrect in the legacy documents. Some of the [...] information is [...] only in the minds of the clients, often fragmented and scattered across members of [...] organizations. Communication is a large part of the process, and that communication cannot be automated away, although it can be enhanced by appropriate use of prototyping.

[...] in both re-engineering and the development of new systems. [...] role in re-engineering efforts. Our experience supports that [...] substantially to the process of inventing, correcting, and refining the conceptual structures on which the architecture of the new system will be based. Most legacy systems are too complicated for individuals to understand. We found that constructing even a very thin skeletal instance of the proposed architecture raised many issues and enabled us to correct, complete and optimize the architecture for both simplicity and performance. (See [3] for lessons learned [...]) [...] done before the architecture had grown to a [...] implementation details. Consequently, the changes could be realized without incurring the large cost and time delays typically encountered later in the development.

[...] modified rapidly, accurately, and cheaply. The UML [...] support automatic code generation for the executable prototype. Such weakness can be remedied by the use of the prototype language PSDL [10, 11] and the CAPS prototyping [...], which provide effective means to model the system's dynamic behavior in a form that can be easily validated by the user via prototype demonstration.

[...] once, to establish a reliability property for all of the programs that could possibly be generated [...] be checked via computable [...]. We focus here [...] assure syntactic correctness and exception closure of the generated code. [...] means that a software module cannot raise any exceptions other than those declared in its interface.

1. Introduction

[...] software. We are addressing the [...] technology for automatic software generation, with particular attention to reliability issues.

We take a domain-specific view of this process: a domain is a family of related problems addressing a common set of issues. A domain analysis identifies [...] issues, formulates a model of these, and determines a corresponding set of solution methods. Users of the proposed computer-aided software generation system describe their particular problem using a domain-specific problem modeling language that provides concrete representations of problems in the domain. The system then automatically determines which solution methods are applicable [...]

[...] can be applied to different problem domains, and that can generate code in any [...]. Therefore we seek uniform and effective methods for generating [...] described above, given definitions of the problem modeling language, the target programming language, [...] synthesizing programs. A simple architecture for this process is shown in Figure 1.

[...] and (2) to provide examples of static [...] rules in this language. We address the problems of [...] that all programs which can be generated from a given set of rules: (1) are syntactically correct, and (2) will not raise any exceptions other than those explicitly specified in an interface description.

This is a step towards a coordinated system of [...] static checks [...] be performed on program synthesis rules. Our hypothesis is that the most cost-effective way to improve software quality is to systematically improve and certify the rules used in a domain-specific software generator. This approach directly addresses the issue of correctly implementing given software requirements. It also indirectly addresses the issue of getting the right requirements, because it should eventually enable rapid prototyping of product quality systems by problem domain experts who need not be software experts. If [...] the requirements are found, [...] will simply update the problem models and regenerate a new version of the solution software.

[...] the claim of cost effectiveness [...] templates can be extended to all past and future applications of the generator, by regenerating [...] generator using the improved templates and then regenerating the past applications. The regeneration [...] be completely automated, thereby reducing labor costs, eliminating a source of random [...], and speeding up the process of repairing a known fault throughout a large family of software systems.

The relation to the theme of this workshop is that [...] scenarios can be addressed by automatically generating new variants of the software that reflect changing issues in the problem domain. Our approach should reduce the explicit quality [...] needed each time the software is changed. By amortizing the quality [...] the template over many applications of the same templates, we can reduce quality assurance costs. The benefits increase with the number of systems generated from the same templates.

[...] systematically certifying [...] correctness of generated code, and gives an example.

• Section 4 does the same for analysis of exceptions.

• Section 5 contains comparisons to previous work.

• Section 6 presents conclusions.

2. Template Language

[...] software synthesis patterns for a given target language. We create such a language based on a functional object model of code generation templates. We take a functional (i.e. side-effect-free) approach [...] and supports effective static analysis methods such as [...] target programming languages. Because many different programming languages [...] all of these [...] at once. [...] generic parameters [...] to providing substitution of [...] other templates. Recursion is included [...]

Template_language = {template, formal_def, template_expression}

TEMPLATE([...], template_expression): [...]

DEF_FORMAL(template_parameter, type): formal_def
-- declares the type of a formal parameter

template_parameter < {id[any], template_expression}

APPLY(id[template], seq[template_expression]): template_expression
template_expression < target_language

Figure 2. Template Abstract Syntax

The construction depends heavily on the [...] object-oriented modeling of programming languages. The situation is illustrated in Figure 3.

Figure 3. Generic Template Language

In object-oriented modeling, class-wide types [...] and extensible. Specifically, each time we add a subclass with a new constructor we add more instances to the class-wide type, thus extending its value set.

[...] properly constructed abstract syntax there should be one [...] syntactic entity [...]. We [...] each nonterminal [...]. Each constructor of these types corresponds to a production of the grammar. Subclass relationships, denoted by <, specify that every instance of the subclass is also an instance of the [...] is allowed. For example, in line 6 of Figure 2 [...] template parameter is a kind of identifier, and also is a kind of template expression. This kind of [...] is used to incorporate reusable types in a library of programming language building blocks [...] identifiers [...] reusable concepts to the application, such as [...]

T < S means T is a subclass of each element of S. [...] instances of a class-wide type include its direct instances and those of all its subclasses. [...]

Subclassing is also used to interface between a target programming language and its extensions. In Figure 2, target_language denotes the set of types comprising the abstract syntax of the target language. Figure 4 shows a very simple example of a target language that illustrates how this works.

target_language = {stmt, exp}

assign(var, exp): stmt
if(exp, stmt, stmt): stmt

integer < exp                          -- integer literals
var < {id[any], exp}                   -- program variables
apply(id[function], seq[exp]): exp     -- operations

subtype rule: x < y => id[x] < id[y] where x, y ∈ type

Figure 4. Example: Micro Target Language


The example in Figure 5 defines a code generation pattern that embodies Newton's method for polynomial evaluation, which is optimal in terms of the number of evaluation steps needed. This is a very simple example of a code generation pattern that is nevertheless realistic, because it embodies a solution method. The example also illustrates the use of all the constructs in the template language. We use infix syntax for the exp constructors * and + to improve legibility (e.g. x*y is short for the term apply(*, x, y)).

An additional benefit of considering the abstract syntax to be an algebra rather than a tree is that we can use well-studied transformation rules. In particular, we can associate equational axioms with the programming language types that define normal forms. Figure 5 illustrates the use of such axioms as rewrite rules that simplify the code produced by the generator in a follow-on normalization process. This is one way to incorporate optimizations into the program generation process, which is useful for unconditional transformations.

TEMPLATE evaluate_polynomial (v: var, c: seq[integer]): exp
-- c contains coefficients of a polynomial, lowest degree first
IF not (is_empty (c))  -- use operations of boolean and seq
THEN v * (evaluate_polynomial (v, rest(c))) + first (c)
ELSE 0
END TEMPLATE

Template application evaluate_polynomial(x, [1, 2, 3]) generates
x * (x * (x * 0 + 3) + 2) + 1

Normalization with integer rules i * 0 = 0, i + 0 = i reduces to
x * (x * 3 + 2) + 1


Figure  5.  Example:  Generation  Pattern 
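The expansion and normalization steps of Figure 5 can be mimicked with a short Python sketch. It is not the authors' generator: terms are represented as nested tuples `(op, left, right)`, the helper names `normalize` and `show` are assumptions of this sketch, and the normalizer also applies the mirrored rules 0 * i = 0 and 0 + i = i (an assumption of commutativity that the figure does not state, but which the reduction of `x * 0 + 3` to `3` requires). The nested scheme generated here is commonly known as Horner's rule.

```python
def evaluate_polynomial(v, c):
    """Expand the template; c holds coefficients, lowest degree first."""
    if c:                                    # guard: not is_empty(c)
        return ("+", ("*", v, evaluate_polynomial(v, c[1:])), c[0])
    return 0                                 # ELSE branch of the template


def normalize(term):
    """Apply i * 0 = 0 and i + 0 = i (and mirrored forms) bottom-up."""
    if not isinstance(term, tuple):
        return term
    op, left, right = term
    left, right = normalize(left), normalize(right)
    if op == "*" and (left == 0 or right == 0):
        return 0
    if op == "+" and right == 0:
        return left
    if op == "+" and left == 0:
        return right
    return (op, left, right)


def show(term, parent=None, is_right=False):
    """Print a term in infix form.

    Parentheses are added around a sum nested in a product and around
    any compound right operand, which is enough for the terms here.
    """
    if not isinstance(term, tuple):
        return str(term)
    op, left, right = term
    text = f"{show(left, op)} {op} {show(right, op, True)}"
    needs_parens = (parent == "*" and op == "+") or is_right
    return f"({text})" if needs_parens else text


raw = evaluate_polynomial("x", [1, 2, 3])
print(show(raw))              # x * (x * (x * 0 + 3) + 2) + 1
print(show(normalize(raw)))   # x * (x * 3 + 2) + 1
```

Because expansion only builds a term and never executes it, the recursion runs at generation time; the generated expression itself contains no recursion.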


Code generation using the template language is very much like evaluation in a functional programming language with call-by-value semantics. Analysis of templates can take advantage of equational reasoning, substitution, and structural induction. The limitation to primitive recursion facilitates the latter. The recursion in the example is structural because rest is a partial inverse for the sequence constructor add (i.e. rest(add(x, s)) = s).

3. Syntactic Correctness of Generated Code

We  treat  the  abstract  syntax  structures  of  the  target  language  as  the  values  of  the  abstract  data  types 
representing  the  programming  language.  We  require  these  types  to  provide  a  pretty  printing  operation  that 
outputs  such  objects  as  text  strings  according  to  the  concrete  syntax  of  the  target  language,  with  a 
readable  format.  Establishing  correctness  of  these  pretty  printing  operations  is  straightforward,  and  in  fact 
their  implementations  can  be  generated  from  an  appropriately  annotated  grammar  for  the  concrete  syntax. 

Given trusted pretty printing operations for the object model of the target language, syntactic correctness of the output reduces to the type-correctness of the ground terms generated by the evaluation of the templates. This can be checked using [...] type system for the template language and conventional type checking methods. Note that the type [...] associated with the signatures of the constructors in the object model of the target [...] than the [...] programming language, which may not even be [...] Figure 6. [...]

TEMPLATE evaluate_polynomial (v: var, c: seq[integer]): exp
IF not(is_empty(c: seq[integer]): boolean): boolean
THEN +(*(v: var,
         evaluate_polynomial(v: var,
                             rest(c: seq[integer]): seq[integer]): exp): exp,
       first(c: seq[integer]): integer): exp
     -- term form of v * evaluate_polynomial (v, rest(c)) + first (c)
ELSE 0: integer
END TEMPLATE

Types conform because integer < exp and var < exp

Relevant signatures: +(exp, exp): exp, *(exp, exp): exp,
first(seq[T]): T, rest(seq[T]): seq[T],
is_empty(seq[T]): boolean, not(boolean): boolean

Figure 6. Example: Syntactic Correctness of Generated Code

[...] type checking calculation. This is sufficient to establish partial type correctness of [...] templates, which implies syntactic correctness [...] all code that could be generated by the template [...]. [...] correctness, [...] still have the possibility that evaluation of the template might fail to terminate.

[...] that all recursions are primitive. The [...] of the compound sequence constructor: rest(add(T, S)) = S. This means that the induction is in fact [...] and hence that evaluate_polynomial is total. Thus the template will produce syntactically correct code for all input values that conform to the type signature of evaluate_polynomial.

[...] constructors that define the abstract syntax and the corresponding partial inverse operations, it is straightforward to automatically check that all recursive calls are primitive with respect to any given parameter. [...] implies that structural induction can be applied uniformly and completely automatically [...] context. Furthermore, our experience suggests that structural recursions are sufficient to define the [...]

4. Exception Closures for Generated Code

One common source of software failure is unhandled [...]. This section explains a method for certifying that all programs generated [...] any unhandled exceptions when placed in a context that handles a specified set of exceptions.

[...] of exceptions that might be raised by the evaluation of any [...] exceptions that might be raised by execution [...] the type exp with a parameterized family of types exp[S: set[exception]]. The intended [...] this family of types has a rich subclass [...]:

S1 ⊆ S2 => exp[S1] < exp[S2]

[...] explicitly for expressions that cannot raise exceptions [...] by the following rule, which describes the [...]:

f(exp[∅]): exp[S1] => f(exp[S2]): exp[S1 ∪ S2]

[...] represented [...]:

(TRY exp[S1] CATCH e USE exp[S2]): exp[(S1 - {e}) ∪ S2]
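The three rules above amount to simple set arithmetic, which the following Python sketch spells out. The function names `is_subtype`, `apply_op`, and `try_catch` are hypothetical, as is the exception name `div0` used in the usage lines; only `ovfl` appears in the paper's running example.

```python
def is_subtype(s1, s2):
    """exp[S1] < exp[S2] holds when S1 is a subset of S2."""
    return frozenset(s1) <= frozenset(s2)


def apply_op(op_exceptions, *arg_exceptions):
    """Exception set of f(args).

    If f applied to exception-free arguments has type exp[S1], then
    applying it to arguments of type exp[S2] yields exp[S1 | S2].
    """
    result = frozenset(op_exceptions)
    for arg_set in arg_exceptions:
        result |= frozenset(arg_set)
    return result


def try_catch(body, caught, handler):
    """(TRY exp[S1] CATCH e USE exp[S2]) has type exp[(S1 - {e}) | S2]."""
    return (frozenset(body) - {caught}) | frozenset(handler)


# Usage: '+' may overflow; its second argument may raise the
# hypothetical exception div0, which a handler then removes.
plus = {"ovfl"}
total = apply_op(plus, set(), {"div0"})
handled = try_catch(total, "div0", set())
```

Here `total` is the set containing `ovfl` and `div0`, and `handled` contains only `ovfl`, so the handled expression conforms to `exp[{ovfl}]` under the subset rule.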


Figure 7 shows the exception analysis for our running example. The parts added to the version in Figure 6 are underlined.

TEMPLATE evaluate_polynomial (v: var, c: seq[integer]): exp[{ovfl}]
IF not(is_empty(c: seq[integer]): boolean): boolean
THEN +(*(v: var,
         evaluate_polynomial(v: var,
                             rest(c: seq[integer]): seq[integer]): exp[{ovfl}]): exp[{ovfl}],
       first(c: seq[integer]): integer): exp[{ovfl}]
     -- term form of v * evaluate_polynomial (v, rest(c)) + first (c)
ELSE 0: integer
END TEMPLATE

Types conform because integer < exp[∅] < exp[{ovfl}] and
var < exp[∅] < exp[{ovfl}]

Relevant signatures: +(exp, exp): exp[{ovfl}], *(exp, exp): exp[{ovfl}],
first(seq[T]): T, rest(seq[T]): seq[T], is_empty(seq[T]): boolean, not(boolean): boolean


Figure  7.  Exception  Closure  of  Generated  Code 


Note that we require the author of the template to specify in the type declaration of a template the set [...]

The analysis shown in the figure establishes a partial exception closure: it guarantees that all [...] by the template can at most raise the exception ovfl [...]

[...] have to address clean termination of the template expansion [...] program generation time. The primitive recursion check explained in the previous section guarantees there will be no infinite recursions, so that termination is guaranteed. However, for clean termination we must also check that evaluation of the template will [...] raise any exceptions at program generation [...]

[...] treated as constructors of the abstract syntax [...] expressions are evaluated, not when [...]

The  sequence  operators  first  and  rest  are  A .u 
syntax,  not  total  constructors.  If  applied  to  an  emotv  '  hey  ,?re  part,al  cluerT  methods  of  the  abstract 
However,  this  can  occur  only  at  program  generation  time,  notat  ^dme"  *  Underflow  e^eption. 

recordsets  o/possible^Tc^fions  3  type  refinement  to 

methods  such  as  first  and  rest.  We  can  introduce  a  c„h tv  ^  t0  record  doma,ns  of  partial 

nonempty  sequences,  and  refine  the  signatures  ofthenartial  ll  T’  S  <  Seq[T’  consisting  of  the 

°  0t  the  partiaI  sequence  operations  first  and  rest  as  follows. 

first(nseq[T,  0]):  T[0] ,  rest(nseq[T,  0]):  seq[T.  01 
irst(seq[T,  0]):  T[seq_underflow],  rest(seq[T,  0]);  Seq[T.  {seq_underflow}] 

template  language  conditioMUF  together  with^ie^le'5  ^  beC3USe  We  have  t0  use  the  guard  of  the 

s  :  seq[T,  S]  and  not  is-empty  (s)  =>  s:  nseq[T,  S] 

This inference is easy because the guard matches the subtype restriction predicate for nseq[T]. This match did not occur by accident: the purpose of the guard is precisely to ensure that the operations first and rest are used only within their domain of definition. In the interests of being able to produce certifiably robust code, we claim that it is not burdensome to require that template designers associate domain predicates with all partial operations, and use those domain predicates explicitly in guards whenever they are needed to ensure that the operations are applied within their proper domains of definition. For example,

first-ok (seq[T]) : boolean where
first-ok  (s)  =  not  (is-empty  (s)). 
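
The role of domain predicates in guards can be illustrated with a small sketch. It is not the template language of this paper: the Python names `first_ok`, `rest_ok`, `DOMAIN_PREDICATE` and `unguarded_uses` are assumptions introduced only for illustration, and the check is purely syntactic (it looks for the predicate name in the guard), which corresponds to restricting guards to an easily recognizable form.

```python
class SeqUnderflow(Exception):
    """Raised when a partial sequence operation is applied to an empty sequence."""


def first(s):
    # Partial operation: defined only on nonempty sequences.
    if not s:
        raise SeqUnderflow("first applied to an empty sequence")
    return s[0]


def rest(s):
    # Partial operation: defined only on nonempty sequences.
    if not s:
        raise SeqUnderflow("rest applied to an empty sequence")
    return s[1:]


def first_ok(s):
    # Domain predicate of first: not (is-empty (s)).
    return len(s) > 0


def rest_ok(s):
    # Domain predicate of rest (assumed to be the same condition).
    return len(s) > 0


# Hypothetical registry: partial operation name -> name of its domain predicate.
DOMAIN_PREDICATE = {"first": "first_ok", "rest": "rest_ok"}


def unguarded_uses(guard_predicates, body_operations):
    """Return the partial operations used in a guarded body whose domain
    predicate does not appear in the guard (a purely syntactic check)."""
    return sorted(
        op
        for op in set(body_operations)
        if op in DOMAIN_PREDICATE and DOMAIN_PREDICATE[op] not in guard_predicates
    )
```

Under these assumptions, a guard that mentions both `first_ok` and `rest_ok` leaves no unguarded use of `first` or `rest`, while a guard that mentions only `first_ok` causes `rest` to be reported.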

-csZ'S,- "  si:  ££££ ES ?££  f  -eptions  in 

The problem of checking whether an unconstrained guard condition implies the domain predicates of arbitrary guarded partial operations is undecidable.

guard  condition  ensures  they^wiU  C‘°SUre  eVen  in  Cases  where  the 

It is more practical to handle a common subset of efficiently recognizable forms and to require template designers to work within the constraints of those recognizable forms. We believe this would be less burdensome than handling the cases where a type check would nominate exceptions that cannot in fact occur, and that it would lead to more robust software by making it practical to do complete analysis of exception closures. For example, we could require the example of Figure 1 to be written in a stylized form that looks like the following:


IF  first-ok  (c)  and  rest-ok  (c) 

THEN  ...  first  (c) ...  rest  (c) ... 

would  r,m  and  rest  to  ensure  that  they 

5.  Comparisons  to  Previous  Work

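We have defined the concept of a program generation pattern so as to make it independent of the target programming language and the process of instantiating the patterns. The purpose of this was to create a context in which analysis of program generation patterns becomes possible and in some cases practical.

The idea of program generation has a long history. Macros are an early form of the idea. However, macros are notoriously difficult to analyze because they traditionally operate on uninterpreted text. This makes the connection between a macro and the syntax of the generated code hard to establish. A further source of complexity becomes apparent: a macro can expand into further macros, so that the number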


of expansion steps before the generated source code actually appears is potentially unbounded. This makes the system very difficult to analyze. At the other extreme are the generic units of Ada. These are clearly connected to the abstract syntax of the language, and the results of instantiating them are easy to analyze. However, they do not allow conditional decisions at instantiation time, and are restricted in the sense that the abstract syntax trees of all possible instantiations have exactly the same structure, except for the actual parameters of the pattern. A language-independent version of the idea can be found in [5], although this appears to be largely text-based.

An aspect of our approach is to model languages as algebras rather than as abstract syntax trees. A form of this idea appears in [4], although it is not exploited there for enabling analysis to any significant degree. The work of the CIP group [1] develops this idea further and takes advantage of the reasoning structures that come with the algebraic modeling approach, such as term rewriting and generation induction principles. This suggests extension to a full object-oriented view, which includes inheritance. The Refine system is the earliest context we know of where grammars are treated as object models with potential inheritance structures, although the documentation does not give any hint about the significance of this capability. In this paper we demonstrate the usefulness of algebraic models of syntax with inheritance for defining language extension transformations that can be applied to all possible target languages.

Another theme is lightweight inference [2]. We have demonstrated that some useful types of static analysis for program generation patterns can be performed via computable and indeed reasonably efficient methods. The processes described here can be implemented using technologies typically used in compilers, such as object attribution rules; they terminate for all possible inputs, and do so in polynomial time. We believe this approach will scale up to large applications, and are currently working out the details to support a tight analysis of the efficiency of the process.

This paper has explored static analysis of meta-programs to check syntactic correctness and exception closure of the generated code. Another kind of static analysis in this family, type checking of meta-programs to ensure the type correctness of the generated code, is considered by another paper in these proceedings [3].


6.  Conclusions 


We  believe  that  formal  models  of  program  generation  templates  can  support  a  variety  of  quality 
improvement  processes  that  can  help  achieve  cost-effective  software  reliability.  This  paper  has  presented 
a  simple  example  of  such  a  formal  model  and  two  such  quality  improvement  processes,  certification  of 
syntactic  correctness  and  freedom  from  unexpected  exceptions  for  all  programs  that  can  be  generated 
from  a  given  program  generation  pattern.  We  expect  the  greatest  advantages  of  this  approach  to  be 
realized when it is applied to build flexible and reliable systems in a product line approach. This
approach should be augmented with systematic methods for domain analysis that culminate in the
development  of  a  domain-specific  library  of  solutions  embodied  in  a  domain-specific  software 
architecture  that  is  populated  with  components  produced  by  model-based  software  generators.  When  the 
technology  matures,  it  should  become  possible  for  problem  domain  experts  to  specify  their  problem 
instances in terms of familiar problem domain models, and to have reliable software solutions to their
problems  automatically  generated,  without  direct  involvement  of  computer  experts. 

The  economic  advantage  of  this  approach  comes  from  the  ability  to  automatically  reap  the  benefits  of 
each quality improvement for all past and future instantiations of the template (if past applications are
regenerated). We believe that it will be profitable to explore methods for lifting many known program
analysis techniques from the level of individual programs to the level of program generation patterns.
This should be explored for a variety of issues that range from certifying absence of references to
uninitialized variables, absence of deadlock, and many others, perhaps ultimately to template-based proof
of postconditions and program termination for generated programs.

To  make  this  vision  practical,  many  engineering  issues  must  be  addressed,  including  presentation 
issues, methods for lightweight inference [2], and support for transforming and enhancing complex sets of
analysis  rules.  Other  issues  include  systematic  methods  for  dynamic  analysis,  testing,  and  debugging  of 
program  generation  rules.  It  is  not  reasonable  to  expect  progress  to  occur  in  an  instantaneous  quantum 
leap  to  perfection.  A  realistic  process  is  a  gradual  one,  where  simple  sets  of  program  generation  rules  are 
deployed,  and  gradually  tuned,  improved,  certified,  and  extended.  A  key  issue  is  enabling  rule 
enhancement  and  exception  closure  extension  without  invalidating  all  previous  effort  on  analysis  and 
certification  of  the  previous  versions. 


sen erat k> n ' t  ool  s” iT  theT^11  .tlleprogram  generation  approach  proposed  here  and  current  compiler 
LJZ1  f  !  *  assoc,ated  «anc  analysis  capabilities  for  the  program  generation  rules  It  is 

Ultra'reliable  C°mPilerS  WU1  ^  bUiU  ^ived tm 
ABSTRACT

Software  reuse  is  widely  considered  to  be  a  way  to 
increase  the  productivity  and  improve  the  quality  and 
reliability  of  new  software  systems.  Identifying,  extracting 
and  reengineering  software  components  that  implement 
abstractions  within  existing  systems  is  a  promising  cost- 
effective  way  to  create  reusable  assets  and  re-engineer 
legacy  systems.  This  paper  summarizes  our  experiences 
with  using  computer-supported  methods  to  develop  a 
software  architecture  to  support  the  re-engineering  of  the 
Janus  Combat  Simulation  System.  In  this  effort,  we  have 
developed  an  Object-Oriented  architecture  for  the  Janus 
Combat  Simulation  subsystem,  and  validated  the 
architecture  with  an  executable  prototype.  In  this  paper, 
we  propose  methods  to  facilitate  the  reuse  of  the  software 
component  of  these  systems  by  recovering  the  behavior  of 
the  systems  using  systematic  methods,  and  illustrate  their 
use  in  the  context  of  the  Janus  System. 


1.  BACKGROUND 

Rapid  changes  in  hardware  and  software  technology, 
combined  with  rapid  changes  in  requirements,  require  new 
methods  to  enable  the  efficient  evolution  of  current 
software  systems.  A  significant  portion  of  these  systems 
are  real-time  control  systems  that  typically  have  rigid 
performance  and  reliability  requirements.  The  ever 
increasing  need  to  integrate  new  requirements  into  these 
systems  poses  a  challenging  problem  for  the  industry  as  it 
strives  to  respond  in  a  timely,  accurate  manner.  There  is  a 
lack of reliable methods to maintain and evolve
computer-based systems.

Software  reengineering  is  the  process  of 
understanding  existing  software  and  improving  it,  for 
increased  or  enhanced  functionality,  better 
maintainability,  configurability,  reusability,  or  other 
software  engineering  goals.  The  process  involves 
recovering  existing  software  artifacts  and  organizing  them 
as  a  basis  for  future  evolution  of  the  software  system. 
Software  reuse  is  a  popular  way  to  increase  productivity 
and  improve  the  quality  and  reliability  of  new  software 
systems.  Identifying,  extracting  and  reengineering 


software  components  which  implement  abstractions 
within  existing  systems  is  a  promising  cost-effective  way 
to  create  reusable  assets  and  re-engineer  legacy  systems. 

We  have  explored  reuse  in  the  context  of  a  case  study 
that  addresses  the  re-engineering  of  the  Janus  System. 
Janus  is  a  software-based  war  game  that  simulates  ground 
battles  between  up  to  six  adversaries.  It  is  an  interactive, 
closed,  stochastic,  ground  combat  simulation  that  features 
precise  color  graphics.  Janus  is  “interactive”  in  that 
command  and  control  functions  are  entered  by  military 
analysts  who  decide  what  to  do  in  crucial  situations  during 
simulated  combat.  The  current  version  of  Janus  operates 
on  a  Hewlett  Packard  workstation  and  consists  of  a  large 
number  of  FORTRAN  modules,  organized  as  a  flat 
structure  and  interconnected  with  one  another  via 
FORTRAN  COMMON  blocks.  This  software  structure 
makes  modification  of  Janus  very  costly  and  error-prone. 
There  is  a  need  to  modernize  the  Janus  software  into  a 
maintainable  and  evolvable  system  and  to  take  advantage 
of modern personal computers to make Janus more
accessible  to  the  Army.  TRAC-Monterey  is  re-engineering 
Janus  into  an  object-oriented  software  system  that  is 
written  in  the  C++  programming  language  and  operates  on 
personal  computers.  Prior  to  rewriting  Janus  in  C++,  the 
software  engineering  group  at  the  Naval  Postgraduate 
School  was  asked  to  extract  the  existing  functionality 
through  reverse  engineering  and  to  produce  an  object- 
oriented  architecture  that  supports  existing  and  required 
enhancements  to  Janus  functionality. 

Software  systems  evolve  as  modifications  are  made  to 
fix  defects  or  to  enhance  functionality.  Software  that  has 
been  involved  in  the  evolutionary  process  for  many  years 
often reaches a state where a decision must be made to
impose  such  major  changes  to  the  software  that  significant 
re-engineering  is  required.  This  decision  is  typically  based 
on  factors  such  as  the  state  of  deterioration  of  the 
software,  high  modification  costs  resulting  from  reliance 
of  the  software  on  outdated  paradigms,  ineffective 
documentation,  and  obsolescence  of  hardware  platforms 
on  which  the  software  is  housed.  We  have  been 
developing  software  evolution  techniques  for  several 
years,  and  have  applied  them  to  the  Janus  software,  which 
has  many  of  the  features  listed  above. 


This  research  was  supported  by  ARO(38690-MA),  ARO(35037-MA)  and  DARPA(99-F759). 
The complexities associated with the re-engineering
of  large  complex  systems  and  the  non-availability  of 
effective  conventional  methods  to  address  the 
complexities  suggest  the  need  to  explore  new  research 
directions.  One  of  the  historically  problematic  features  of 
conventional  methods  is  that  the  models  that  are  produced 
are  typically  not  applicable  across  multiple  phases  of  the 
software  development  or  the  software  re-engineering 
process.  The  software  engineer  experiences  both  a 
syntactic  and  semantic  disconnect  from  one  life-cycle 
phase  to  the  next.  Another  problematic  feature  is  that 
current  methods  are  not  sufficiently  automatable  to 
feasibly  support  the  re-engineering  of  complex  systems 
due  to  lack  of  effectively  computable  and  accurate 
methods  for  extracting  and  assessing  the  information  that 
must  be  analyzed.  This  research  focuses  on  enhancing 
software  evolution  by  defining  a  formal  framework  which 
includes  methods  and  representations  that  are  integrable 
across  multiple  phases  of  the  software  evolution  process. 

The  objectives  of  this  paper  are  to: 

•  Describe  a  formal  framework  for  design  recovery. 
Design  recovery  is  a  vital  aspect  of  the  software 
evolution  process.  We  define  a  formal  framework  for 
recovering  design  information  that  facilitates  the 
derivation  of  multiple  higher  level  abstractions  with 
varying  levels  of  formality. 

•  Explore a reuse and reengineering method for
legacy systems. The method helps to reuse the
algorithm and data information extracted from the
legacy system and to reengineer the system and class
structure by reorganizing the data and
functions.

•  Investigate  specification  representations.  System 
requirements  expressed  with  formal  mathematical 
representations  improve  the  reliability  and 
maintainability  of  a  system  and  extend  the 
opportunities  for  computer  aid.  We  define  a 
methodology  that  facilitates  the  creation  of 
specifications  of  requirements  from  code. 

•  Report  on  our  experiences  in  applying  these  concepts 
to  the  re-engineering  of  the  Janus  system. 


2.  OBJECT  ORIENTED  MODEL 

We  are  developing  a  methodology  that  establishes  a 
formal  foundation  from  which  to  reengineer  systems.  The 
methodology  consists  of  two  major  steps:  the  derivation  of 
object-oriented  design  models  and  the  derivation  of  formal 
specifications  from  the  design  models. 

1)  Object-oriented  design  models:  An  object-oriented 
view  of  a  non  object-oriented  system  provides 
understanding  about  the  behavior  and  relationships  in 
the  system  and  facilitates  the  re-engineering  of  a 
system to an object-oriented implementation [1].
Object-orientation  is  the  amalgam  of  three  concepts: 
encapsulation,  polymorphism,  and  inheritance. 


Encapsulation  is  realized  as  a  class.  Classes  are 
instantiated  to  create  objects,  which  form  the  basic 
run-time entity. Polymorphism refers to the ability of
a variable to refer to objects of different types during
program execution, so that generalized algorithms can
be applied to many types of objects. Inheritance defines a relation
between  classes  whereby  the  definition  of  a  class  is 
based  on  extending  and  specializing  the  definitions  of 
existing classes. It encourages the reuse of classes
that are similar by allowing the tailoring of parent
classes to meet the needs of a class with similar
requirements in a way that still meets the requirements of
the parent classes. Thus, “inheritance coupled with
polymorphism  and  dynamic  binding  minimizes  the 
amount  of  existing  code  that  must  be  changed  when 
extending  a  system”.  We  have  developed  new 
techniques  to  derive  object  representations  from  non 
object-oriented  code  [2]. 

2)  Specifications:  We  have  developed  a  set  of  high-level 
specification  tools  (CAPS)  that  formally  represent  the 
functionality  of  legacy  systems  in  an  executable  form 
that  supports  prototyping.  A  formal  specification  of  a 
system,  which  is  a  description  of  a  system  using  a 
notation  with  a  precisely  defined  semantics,  provides 
clear  and  precise  communication  of  the  system 
requirements  by  avoiding  the  ambiguities  of  natural 
language,  and  thereby  reducing  design  errors  and 
testing  time.  Benefits  of  CAPS  methods  are  discussed 
at  length  in  [3,  4].  A  process  which  includes  the 
creation  of  a  graphic  representation  of  a  legacy 
system  from  code  will  serve  to  not  only  provide 
structure  and  accurate  documentation  for  the  system 
but  will  also  allow  the  system  to  utilize  the  power  of 
graphic  specifications  for  the  re-engineering  process. 
We  have  derived  methods  to  express  the  functionality 
of  the  legacy  systems  using  graphic  methods. 

The  research  was  motivated  by  the  need  for  better 
techniques  for  the  extraction  and  utilization  of  desirable 
functionality  of  an  existing  system  for  re-engineering, 
reuse,  and  maintenance.  The  work  was  also  motivated  by 
the  recognition  that  graphic  specifications  are  currently 
being  used  successfully  on  a  broad  range  of  applications 
in  industry  because  of  their  potential  to  decrease  software 
costs  and  enhance  software  reliability  by  helping  detect 
errors.  The  abstractions  will  provide  suitable 
representations  from  which  to  forward  engineer  a  system 
and  will  facilitate  the  integration  of  existing  requirements 
with  new  requirements. 


3.  REUSING  AND  RE-ENGINEERING  METHODS 

We  present  a  new  program  slicing  process  for 
identifying  and  extracting  code  fragments  implementing 
functional  abstractions.  The  process  is  driven  by  the 
specification  of  the  function  to  be  isolated,  given  in  terms 
of  a  precondition  and  a  postcondition.  Symbolic  execution 
techniques are used to abstract the preconditions for the
execution  of  program  statements  and  predicates.  The 
recovered  conditions  are  then  compared  with  the 
precondition  and  the  postcondition  of  the  functional 
abstraction.  The  statements  whose  preconditions  are 
equivalent  to  the  pre  and  postconditions  of  the 
specification  are  candidates  to  be  the  entry  and  exit  points 
of  the  slice  implementing  the  abstraction.  Once  the  slicing 
criterion  has  been  identified  the  slice  is  isolated  using 
algorithms  based  on  dependence  graphs.  The  process  has 
not  been  specialised  for  programs  written  in  the 
FORTRAN  or  C  language.  Both  symbolic  execution  and 
program slicing are performed by exploiting the Data
Flow Graph (DFG) and Control Flow Graph (CFG),
fine-grained dependence-based program representations
that can be used for most software maintenance tasks. The
work described in this paper aims to explore reverse
engineering and reengineering techniques for reusing
software components from existing systems.
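
The isolation of a slice over a dependence graph can be sketched as follows. This is a generic backward slice computed by reachability over data and control dependence edges, not the specific algorithms used in this work; the function name `backward_slice`, the dictionary encoding of the dependence graph, and the five-statement example program are assumptions made for illustration.

```python
from collections import deque


def backward_slice(dependences, criterion):
    """Return the set of statements that the criterion statement
    transitively depends on, including the criterion itself.

    dependences maps a statement id to the ids of the statements it
    directly depends on, through either data or control dependence.
    """
    visited = {criterion}
    worklist = deque([criterion])
    while worklist:
        stmt = worklist.popleft()
        for dep in dependences.get(stmt, ()):
            if dep not in visited:
                visited.add(dep)
                worklist.append(dep)
    return visited


# Hypothetical five-statement program:
#   1: n = read()
#   2: total = 0
#   3: count = 0
#   4: total = total + n
#   5: print(total)
example = {4: {1, 2}, 5: {4}}

# Slicing on statement 5 keeps statements 1, 2, 4 and 5 and drops statement 3.
slice_for_print = backward_slice(example, 5)
```

In the process described above, the slicing criterion would come from matching recovered preconditions against the specification; here it is simply given as a statement id.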


3.1.  PROGRAM  SLICING  AND  INFORMATION 
EXTRACTING 

We  extracted  dependency  and  control  information  to 
enable  the  definition  of  object  models.  This  phase  groups 
together  the  activities  of  source  code  analysis  and 
produces  sets  of  software  components.  Each  one  of  these 
sets  is  a  candidate  to  make  up  a  reusable  module  when 
suitably  de-coupled,  reengineered  and  possibly 
generalised.  This  work  includes  code  structuring,  code 
segmentation,  dependency  analysis,  and  finally 
aggregation  to  produce  design  abstractions. 

We  initiated  the  design  recovery  process  with  a 
preprocessing  step  that  restructures  code.  We  built  on  the 
theory  that  unstructured  code  can  be  written  using  only  D- 
structures  [5]  and  relied  on  existing  algorithms  for  that 
purpose [6]. Our research within this phase involves the
use  of  program  slicing  techniques  for  isolating  code 
fragments  implementing  functional  abstractions.  Program 
slicing has been used both as a structural and as a specification-
driven method. As a structural method, program slicing has
been  used  to  identify  external  user  functionalities  in  large 
programs.  The  isolation  of  an  internal  domain  dependent 
function  can  be  driven  by  its  formal  specification.  The 
specification  can  be  used  together  with  symbolic 
execution  techniques  to  identify  a  suitable  slicing 
criterion.  Code  segmentation  is  needed  in  order  to  reduce 
the  granularity  and  thus  the  complexity  of  the  remaining 
processes.  We  have  defined  a  segmentation  scheme  that 
separates  the  code  into  modular  units  while  also  removing 
syntactic  sugar  features  of  the  code.  We  have  also  defined 
heuristics  to  attach  in-code  documentation  to  the 
appropriate segment. For a program P the result is a set of
segments, such that $SG = \{sg_1, sg_2, \ldots, sg_n\}$ and $P_f = \bigcup_{i} sg_i$,
where $1 \le i \le n$ and $P_f$ represents code that is identical in
functionality to P.


Following  the  segmentation,  we  defined  dependency 
algorithms that analyze each $sg_i$. Specific slicing
algorithms  that  are  modified  forms  of  the  slicing 
algorithms  found  in  [7]  are  employed  at  the  statement, 
construct,  and  block  levels.  These  algorithms  provide 
information  on  all  variables:  local  variables,  non-local 
variables,  array  variables,  and  data  typing. 

The  results  of  the  restructuring,  segmentation,  and 
dependency  steps  are  segment  design  representations  and 
a  global  design  representation.  These  representations 
include  traditional  methods,  such  as  call  graphs,  structure 
charts,  and  hierarchical  diagrams  and  other  less 
conventional  representations  such  as  variable  usage  and 
state  change  descriptions.  These  representations  serve  as 
input  to  perform  object  identification  and  to  create  formal 
specifications  of  object  behavior. 

Results  of  our  work  include  methods  that  recover  the 
design  information  at  varying  levels  of  granularity, 
expressible  in  numerous  forms  from  both  data  and 
functional  viewpoints.  The  data  and  control  dependency 
representations  are  the  basis  for  our  object  extraction 
research. 


3.2.  REUSABLE  COMPONENT  CONSTRUCTING 

This phase groups together the activities that analyze
the sets of candidate reusable components
singled out in the Program Slicing and Information
Extracting phase and produce a set of reusable modules,
using reengineering techniques. Also, this phase groups
together  the  activities  that  produce  the  specifications  of 
each  one  of  the  reusable  modules  obtained  in  this  phase. 
Both  the  functional  and  the  interface  specifications  must 
be  produced  in  this  phase.  We  used  object-oriented  and 
prototyping techniques to abstract a formal specification
from  source  code  modules  implementing  functional 
abstractions.  Finally,  we  need  to  classify  the  reusable 
modules  and  related  specifications  according  to  a 
reference  taxonomy.  The  aim  is  to  re-engineer  legacy 
systems  with  the  reusable  modules  produced. 

Program  comprehension  is  the  most  expensive 
activity of software maintenance. The different phases of a
reuse reengineering process involve comprehension
activities  for  understanding  the  structure  of  existing 
systems,  the  functionality  implemented  by  a  reuse 
candidate  module  and  the  reengineering  effort.  We  present 
a  method  for  reuse  reengineering  existing  FORTRAN  or 
C  systems.  Our  goal  is  to  create  reusable  software 
components  with  object-oriented  methods. 

The  problem  of  extracting  encapsulated  reusable 
software  components  from  legacy  systems  is  an  area  of 
active  research.  The  concept  of  the  object  module  as  a 
means  of  restructuring  FORTRAN  code  into  an  object- 
oriented  style  was  introduced  in  [8].  While  code  structured 
as object modules is not truly object-oriented, it marked
the  beginning  of  progress  along  that  path.  The  problem  of 
object  identification  has  been  approached  by  first 
developing  a  formal  specification  of  the  code  and  then 
identifying  objects  from  the  formal  specification  in  some 
methods  [9].  In  an  informal  approach,  Sward  translates 
code  to  natural  language  descriptions  and  then  applies 
object-oriented  analysis  and  design  techniques,  such  as 
OMT,  to  create  the  object  design  [10].  A  design  recovery 
approach  which  automatically  extracts  task  flow 
information  utilizing  both  source  code  and  non-source 
code  information  is  found  in  Holtzblatt’s  work  [11].  Other 
research  that  addresses  behavior  abstraction  includes 
object  extraction  and  translation  to  C++  using  data  flow 
analysis  [12],  partial  evaluation  for  code  comprehension 
[13],  and  development  of  new  Ada  programs  by  reusing 
FORTRAN  code  [14]. 

In  other  related  work,  a  complete  translation  to  an 
intermediate  form  in  the  UNIFORM  language  is  used  in 
Lano’s  work  [15]  as  a  bridge  to  a  functional  description 
language  and  then  finally  to  a  Z  specification.  In  some 
methods,  COBOL  code  is  reverse  engineered  to  Z++  and 
then  reengineered  to  COBOL  code  using  refinement  as  a 
part  of  the  REDO  (Re-engineering,  Documentation,  and 
Validation  of  Systems)  project  [16].  A  transformation 
process  that  creates  C++  code  from  COBOL  code  is  given 
in  [17].  Other  work  on  reverse  engineering  of  COBOL 
systems  to  SSADM  specifications  is  a  part  of  the 
RECAST  (Reverse  Engineering  into  CASE  Technology) 
method  in  which  information  extracted  from  source  code 
is  represented  in  PSL  to  eventually  produce  input  for  the 
physical  design  phase  of  SSADM  [18].  In  an  approach 
that  requires  a  large  set  of  transformations,  Ward 
translates  assembler  code  to  a  wide-spectrum  language 
(WSL)  which  contains  primitive  statements,  such  as 
assertions  and  guards;  compound  statements,  including 
sequential  composition,  choice  and  recursive  procedures; 
and  other  language  extensions  including  a  command 
language,  loops  with  multiple  exits,  and  mutually 
recursive  procedures  [19].  In  some  approaches,  code 
semantics  are  expressed  as  logic  specifications  [20]. 

Research  that  involves  the  extraction  of  modules  and 
reusable  components  from  legacy  code  includes 
algorithms  that  construct  a  hierarchical  structure  from  an 
implementation  description  [21],  methods  to  identify 
abstract  data  types  based  on  user  defined  data  types  [22], 
direct slicing to extract specific types of code segments
[23], identification of cliches to recover program design
[24], program segmentation based on focusing and
factoring operations on COBOL code [25, 26], and
component  identification  based  on  formal  parameter  types 
and  global  variables  [27].  Methods  to  abstract  the 
behavior  of  programs  by  deriving  mathematical 
expressions  from  prime  programs  are  found  in  Hausler’s 
work [28]. An enabling technology which represents
software  in  the  form  of  annotated  abstract  syntax  trees  in  a 
persistent  object-oriented  database  and  then  uses  an 


executable  specification  language  for  analysis  is  described 
in  Markosian’s  work  [29]. 

We use an incremental approach based on
graph-theoretic and set-theoretic concepts. We have investigated
the construction of reusable components from procedural code to
produce intermediate representations from functional and
data  viewpoints.  We  then  use  the  intermediate 
representation  to  define  a  high-level  object  view  of  the 
legacy  system.  Our  code  and  concept  abstraction  methods 
include  the  identification  of  candidate  objects  along  with 
their  associated  attributes  and  methods. 

Our  object  extraction  algorithms  are  based  on  the 
following  object  model  for  object  O: 

$O = \langle A, MD \rangle$

$A = \{A_1, A_2, \ldots, A_n\}$

$MD = \{MD_1, MD_2, \ldots, MD_m\}$

where  A  represents  attributes  and  MD  represents 
operations  that  act  on  members  of  A.  Our  approach  is  both 
data-driven  and  bottom-up.  The  granularity  of  a  program 
is  viewed  at  the  program,  subroutine,  and  statement  levels; 
however,  the  primary  focus  for  the  unit  of  functionality  is 
the  subroutine.  Using  the  parameters  necessary  for  the 
execution  of  each  subroutine,  the  goal  is  to  find  the 
smallest  set  of  parameters  needed  to  obtain  the  strongest 
cohesive  unit,  which  becomes  a  candidate  set  of  attributes 
for  an  object  type. 

We  use  a  greedy  approach  to  the  derivation  of  the  A 
component  of  O  which  considers  both  actual  parameters 
and  global  variables.  To  partition  the  set  of  actual 
parameters,  AP,  where 

$AP = \{AP_1, AP_2, \ldots, AP_n\}$

a graph-theoretic approach is used. We define an
undirected graph G with nodes AP_i, 1 ≤ i ≤ n, and with
edges connecting AP_i and AP_j if the two parameters both
occur in at least one subroutine call. A weight function,
W, is then defined to give values to the edges of G. W is
computed for all pairs of parameters AP_i, AP_j ∈ AP with
respect to each subroutine invocation. A constant is used
to indicate positive, negative or null contribution to
cohesiveness. We define a weighted adjacency matrix M
where the value of each M(i, j) is the cumulative value of
W(AP_i, AP_j) over all subroutine invocations. Thus, M(i, j)
represents a measure of the degree to which parameters
AP_i and AP_j are functionally related.
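The construction of M can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the text does not give the constants used by W, so the values +1 (both parameters appear in an invocation), -1 (exactly one appears), and 0 (neither appears) are assumptions, as are the names adjacency_matrix, params, and calls.

```python
from itertools import combinations

# Assumed constants for W; the paper only says that a constant marks a
# positive, negative, or null contribution to cohesiveness.
BOTH, ONE, NEITHER = 1, -1, 0


def adjacency_matrix(params, invocations):
    """Accumulate W(AP_i, AP_j) over all subroutine invocations."""
    index = {p: k for k, p in enumerate(params)}
    n = len(params)
    M = [[0] * n for _ in range(n)]
    for call in invocations:
        used = set(call)
        for a, b in combinations(params, 2):
            if a in used and b in used:
                w = BOTH
            elif a in used or b in used:
                w = ONE
            else:
                w = NEITHER
            i, j = index[a], index[b]
            M[i][j] += w  # the graph is undirected, so M stays symmetric
            M[j][i] += w
    return M


# Hypothetical data: three parameters and the actual parameters of three calls.
params = ["x", "y", "z"]
calls = [("x", "y"), ("x", "y"), ("y", "z")]
M = adjacency_matrix(params, calls)
```

With these assumed weights, x and y end up with the only positive entry, which matches the intuition that they are passed together most often.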

Following  the  derivation  of  the  weighted  adjacency 
matrix,  an  initial  set  of  object  attributes  is  determined  by 
using  a  threshold  approach.  The  potential  threshold  values 
are the non-negative real numbers r such that r ∈ M. For
each  r,  the  transitive  closure  is  computed  to  obtain  the 
attribute  sets  that  are  related  at  that  threshold  level.  The 
objective  is  to  select  the  threshold  level  that  produces  the 
largest data sets with the strongest cohesion. The design
engineer is encouraged to apply domain knowledge when
selecting the optimal threshold level.
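The threshold step can be illustrated as follows. The sketch assumes that two parameters are directly related at level r when M(i, j) >= r, and it computes the transitive closure with a union-find structure; both choices, and the names attribute_sets and candidate_thresholds, are ours and are not taken from the original tool.

```python
def attribute_sets(params, M, r):
    """Group parameters that are transitively related at threshold r."""
    parent = list(range(len(params)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(params)):
        for j in range(i + 1, len(params)):
            if M[i][j] >= r:  # assumed relation test
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(params):
        groups.setdefault(find(i), []).append(p)
    return sorted(groups.values())


def candidate_thresholds(M):
    """Non-negative off-diagonal values of M, in increasing order."""
    n = len(M)
    return sorted({M[i][j] for i in range(n) for j in range(n)
                   if i != j and M[i][j] >= 0})


# Hypothetical matrix for four parameters.
params = ["a", "b", "c", "d"]
M = [[0, 3, -1, 0],
     [3, 0, 2, -2],
     [-1, 2, 0, 1],
     [0, -2, 1, 0]]
levels = {r: attribute_sets(params, M, r) for r in candidate_thresholds(M)}
```

Inspecting levels shows the trade-off the text describes: a high threshold yields small, tightly related sets, while a low threshold merges everything into one large set.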

Building on the actual parameter analysis, a similar
approach  for  determining  strength  among  global  variables 
is  used.  Issues  related  to  the  global  variables,  including 
aliasing,  were  resolved.  After  the  determination  of  the 
attributes,  the  method  component  for  an  object  is 
determined.  We  use  a  state  change  approach  to  attach 
methods  to  objects.  In  order  to  derive  the  state  change 
information  needed,  we  modified  the  concept  of  program 
slicing  from  its  original  definition  in  Weiser’s  paper  [7]. 
We  perform  slicing  for  each  attribute  set  on  a  subset  of  the 
subroutines  and  the  resultant  set  becomes  a  method  in  the 
corresponding  object. 
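To make the idea of slicing with respect to an attribute set concrete, the following sketch computes a backward slice over straight-line code only. It is a deliberate simplification of Weiser-style slicing (no branches, loops, or aliasing) and is not the modified slicing algorithm used in the study; the statement encoding and the name slice_for_attributes are assumptions.

```python
def slice_for_attributes(statements, attributes):
    """Backward slice of straight-line code with respect to a set of variables.

    statements: list of (defined_variable, used_variables) in program order.
    Returns the indices of the statements that can affect the final values
    of the given attributes.
    """
    relevant = set(attributes)
    kept = []
    for idx in range(len(statements) - 1, -1, -1):
        defined, used = statements[idx]
        if defined in relevant:
            kept.append(idx)
            relevant.discard(defined)  # this definition satisfies the demand
            relevant.update(used)      # its inputs become relevant in turn
    return sorted(kept)


# Hypothetical subroutine body:
#   0: t = a + 1
#   1: x = t * 2
#   2: u = b
#   3: y = x + c
#   4: v = u
body = [("t", {"a"}), ("x", {"t"}), ("u", {"b"}), ("y", {"x", "c"}), ("v", {"u"})]
position_method = slice_for_attributes(body, {"x", "y"})
status_method = slice_for_attributes(body, {"v"})
```

Each slice is the candidate body of one method: the statements that change the state of one attribute set are separated from those that change another.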

The  result  of  applying  these  algorithms  is  a  set  of 
candidate  objects.  Class  abstractions  need  to  be  defined 
over  this  set  to  take  advantage  of  the  abstraction  and 
inheritance  features  of  object  orientation.  We  have  only 
begun  to  investigate  the  class  abstraction  process. 
Enhancement  of  the  class  abstraction  methods  is  a  part  of 
our  ongoing  work. 


3.3. JANUS(A) CASE STUDY

The  objective  of  the  case  study  was  to  re-engineer  an 
object-oriented  architecture  for  the  Janus(A)  legacy 
system.  The  first  step  in  our  process,  system  and 
requirements  understanding,  took  the  form  of  a  series  of 
brief  meetings  with  the  client,  TRAC-Monterey,  which 
also  included  a  short  demonstration  of  the  current 
software  system.  We  asked  questions  and  made  notes  on 
the  system’s  operation  and  its  current  functionality.  We 
paid  particular  attention  to  the  client’s  view  of  the  system 
to  gather  their  ideas  on  its  strengths,  weaknesses,  and 
desired  and  undesired  functionality.  Additionally  we 
collected  copies  of  the  Janus  User’s  Tutorial  manual, 
Janus  User  Manual,  the  Software  Design  Manual  from  a 
previous  version  of  Janus  (3.X/UNIX),  and  the  Janus 
Version  6.88  Release  Notes.  Our  goal  was  to  gather  as 
much  information  as  we  could  about  the  currently  existing 
system  to  aid  in  gaining  a  clearer  understanding  of  its 
present  functionality.  The  intent  of  this  procedure  was  to 
ensure  that  the  system’s  current  functionality  was  not  lost 
nor  misrepresented  in  the  transformation  into  a  more 
abstract,  modular  format,  and  to  identify  aspects  of  current 
system  functionality  that  did  not  match  user  needs. 

The  focus  of  the  re-engineering  effort  was  to 
abstractly  capture  the  system’s  functionality  and  then 
produce  system  models  that  would  most  accurately 
represent  that  functionality,  while  factoring  out 
independent  concerns  and  aspects  that  were  likely  to 
change. 

Armed  with  the  Janus  source  code,  we  proceeded  to 
divide  the  code  by  directories  amongst  the  team  members. 
Each  team  member  was  assigned  roughly  six  to  seven 
directories  to  explore,  examine  and  gather  information. 
Using manual techniques supported by UNIX commands
and review procedures, we were able to get a fairly good
idea  of  what  each  subroutine  was  designed  to  do.  We  also 
used  the  Software  Programmers’  Manual  to  aid  in 
understanding  each  subroutine’s  intended  function.  In 
doing  so  we  were  able  to  group  the  subroutines  by 
functionality  to  get  a  better  understanding  of  the  major 
data  flows  between  programs. 

Using  that  knowledge,  we  developed  functional 
models from the data flows. We used an automated tool,
the Computer-Aided Prototyping System (CAPS) [3],
version 2.0, developed by Professor Luqi and the
Software Engineering group at the Naval Postgraduate
School, to assist in developing the abstract models. CAPS
allowed us to rapidly graph the gathered data and
transform it into a more readable and usable format.
Additionally,  CAPS  enabled  us  to  develop  our  diagrams 
separately  with  the  associated  information  flows  and 
stream  definitions,  and  then  join  them  together  under  the 
CAPS  environment,  where  they  can  be  used  to  generate  an 
executable  model  of  the  architecture. 

Next, we proceeded to develop object models of the
Janus  System  using  the  aforementioned  materials  and 
products,  to  create  the  modules  and  associations  amongst 
them.  This  was  probably  the  most  difficult  and  most 
important  step.  It  required  a  great  deal  of  analysis  and 
focus  to  transform  the  currently  scattered  sets  of  data  and 
functions  into  small,  coherent  and  realizable  objects,  each 
with  its  own  attributes  and  operations.  In  performing  this 
step,  we  used  our  knowledge  of  object-oriented  analysis 
and  applied  the  OMT  techniques  and  the  UML  notations 
to  create  the  classes  and  associated  attributes  and 
operations.  This  was  a  crucial  step  because  we  had  to 
ensure  that  the  classes  we  created  accurately  represented 
the  functions  and  procedures  currently  in  the  software. 
We  used  the  HP-UNIX  systems  at  the  TRAC-Monterey 
facility  to  run  the  Janus  simulation  software  to  aid  in 
verifying  and/or  supplementing  the  information  we 
obtained  from  reviewing  the  source  code  and 
documentation. This step enabled us to better analyze the
simulation system, gain insight into its functionality,
and concentrate further on module definition and
refinement.

During  this  phase  of  the  project,  the  re-engineering 
team  met  several  times  each  week  for  a  period  of  two  and 
a  half  months  to  discuss  the  object  models  for  the  Janus 
core  elements  and  the  object-oriented  architecture  for  the 
Janus  System.  They  presented  the  findings  to  the  Janus 
domain  experts  from  TRAC-Monterey  and  Rolands  & 
Associates  at  least  once  per  week  to  get  feedback  on  the 
models  and  architectures  being  constructed.  In  addition, 
the  re-engineering  team  also  presented  the  findings  to 
members  of  the  OneSAF  project,  the  Combat21  project, 
and  the  National  Simulation  Center.  Based  on  the 
feedback  from  the  domain  experts,  the  re-engineering 
team  revised  the  object  models  for  the  Janus  core  elements 
and  developed  a  3-tier  object-oriented  architecture  for  the 
Janus System. This revision required creative human
effort,  as  described  next. 

We  used  our  approach  to  reuse  the  information 
extracted  from  the  old  system.  The  most  important  type  of 
reuse  was  reuse  of  implicit  domain  models.  We  reused  the 
domain  analysis  and  knowledge  since  the  domain  was 
stable  across  the  re-engineering  transformations.  This 
greatly reduced the time and effort that needed to be spent on
domain  related  work,  such  as  the  analysis  of  the  domain 
dependent  functions.  Second  was  reuse  of  implementation 
concepts.  This  kind  of  reuse  included  the  user 
functionalities,  functional  abstraction,  task  flow,  and  user 
interface  specifications.  Third  was  the  reuse  of  data 
models. The reuse of data models was very helpful in
reorganizing the data, although we needed to
transform the old data structures into new data structures.
Fourth  was  the  reuse  of  algorithms.  The  code  could  not  be 
reused  directly  because  it  had  to  be  transformed  into 
another  language  (Ada).  However,  the  main  algorithms 
were the same: we did not need to redesign the
algorithms, only to rewrite them in the new language.

The  new  architecture  of  Janus  uses  an  explicit  priority 
queue  of  event  objects  to  schedule  the  simulation  events. 
Each  event  object  has  an  associated  simulation  object, 
which is the target of the event. There are 14 event groups,
which correspond to the 14 event subclasses. An
object-oriented approach enabled us to reduce the number of
event types needed in the simulation, compared to the
legacy code. Depending on the subclass to which an event
object belongs, the "execute" method will invoke the
corresponding  event  handler  of  the  associated  simulation 
object  to  handle  the  event.  The  simulation  object 
superclass  defines  the  interface  of  the  event  handlers  for 
the  event  groups,  and  provides  an  empty  body  as  the 
default  implementation  for  the  event  handlers.  The 
methods  are  overridden  by  the  actual  event  handler  code 
at  the  subclasses  that  have  non-empty  actions  associated 
with  the  events. 

This  approach  enables  the  same  code  to  handle  all 
kinds  of  events,  including  those  for  future  extensions  that 
are  yet  to  be  designed.  Event  objects  are  created  and 
inserted  into  the  event  queue  either  by  the  initialization 
procedure  at  the  beginning  of  the  simulation,  by  the 
constructors  of  simulation  objects,  or  by  the  actions  of 
other  event  handlers.  Depending  on  the  actual 
implementation  of  when  and  how  events  are  inserted  into 
the  priority  event  queue,  it  may  be  necessary  to  allow 
events  to  change  their  priorities  while  waiting  in  the 
queue.  The  priority  of  an  event  is  determined  by  the  time 
at  which  the  event  is  supposed  to  occur,  and  by  event  type 
in  case  more  than  one  event  is  scheduled  at  the  same  time. 
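The scheduling scheme described above can be sketched as follows. This is a minimal illustration, not the generated Janus prototype: only two of the event groups are shown, and their names (move, fire), the type ranks, and the class names are hypothetical. Events are ordered by time, then by event type, with an insertion counter as a final tie-breaker.

```python
import heapq
import itertools


class SimulationObject:
    """Superclass: one handler per event group, empty by default."""

    def on_move(self, time):
        pass

    def on_fire(self, time):
        pass


class Event:
    type_rank = 0  # assumed tie-breaker among events scheduled at the same time

    def __init__(self, time, target):
        self.time = time
        self.target = target

    def execute(self):
        raise NotImplementedError


class FireEvent(Event):
    type_rank = 0

    def execute(self):
        self.target.on_fire(self.time)


class MoveEvent(Event):
    type_rank = 1

    def execute(self):
        self.target.on_move(self.time)


class EventQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def schedule(self, event):
        key = (event.time, event.type_rank, next(self._seq))
        heapq.heappush(self._heap, key + (event,))

    def run(self):
        # The same loop handles every event subclass, including future ones.
        while self._heap:
            *_, event = heapq.heappop(self._heap)
            event.execute()


class Tank(SimulationObject):
    """Overrides only the handlers that have non-empty actions."""

    def __init__(self, log):
        self.log = log

    def on_move(self, time):
        self.log.append(("move", time))

    def on_fire(self, time):
        self.log.append(("fire", time))


log = []
tank = Tank(log)
queue = EventQueue()
queue.schedule(MoveEvent(5.0, tank))
queue.schedule(FireEvent(5.0, tank))
queue.schedule(MoveEvent(1.0, tank))
queue.run()
```

A binary heap does not support changing the priority of a queued event in place; if that capability turns out to be necessary, one common workaround is to mark the old entry as cancelled and push a new one.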

One  of  the  objectives  of  the  reengineering  effort  was 
to  add  the  capability  for  a  Janus  simulation  to  interact  with 
other  simulations  in  a  distributed  environment.  To 
accomplish this, World Model object subclasses were
created to provide specialized methods for the world
modeler  to  update  objects  from  other  simulators. 
Information  concerning  objects  local  to  the  Janus 
simulator  can  be  broadcast  over  the  simulation  network, 
either  periodically  by  an  active  world  modeler  object,  or 
by  individual  local  objects  whenever  they  update  their 
own  states. 


3.4.  EVALUATION  OF  RESULTS 

We  tested  our  methods  for  identifying  objects  on  a  set 
of  programs  ranging  from  500  lines  of  code  to  10,000 
lines of code. As a part of our test bed, we used programs
from Janus(A) that were developed by the DoD. Our
test  protocol  was  to  begin  the  testing  process  with  small 
programs  so  that  the  dependency  and  slicing  information 
could  be  validated  manually.  The  testing  strategy  was  to 
choose  test  programs  that  exhibit  different  code 
characteristics,  particularly  related  to  the  use  of  global 
variables.  We  were  able  to  manually  verify  the  accuracy 
of  the  extraction  routines  on  small  systems. 

We  then  applied  the  methodology  to  medium-sized 
programs  and  evaluated  the  results.  Our  evaluation 
process  included  the  identification  of  a  set  of  metrics 
against  which  to  measure  the  designs.  Metrics  in  the 
reverse  engineering  area  are  sparse.  We  adopted  the 
approach  of  measuring  our  success  using  the  following 
three metrics:

M.1 Functional equivalence of newly created and original
designs.

M.2 Quality of newly created design.

M.3 Reuse rate of the original program.

M.1 Functional equivalence of newly created and
original designs

The  design  of  a  program  SI  is  functionally  equivalent  to 
the  design  of  program  S2  if  when  they  are  executed  with 
identical  inputs,  they  produce  identical  outputs.  This  is  a 
critical  measure.  To  assess  the  functional  equivalence  of 
our  abstracted  designs,  we  implemented  the  designs  in  an 
object-oriented  environment  and  then  ran  test  cases  on  the 
new  and  the  old  systems.  Based  on  our  test  cases,  the  test 
systems  were  functionally  equivalent. 
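A test harness for this metric can be as simple as the sketch below, which runs the old and new implementations on identical inputs and reports any disagreement. The two distance routines are hypothetical stand-ins for a legacy subroutine and its object-oriented replacement; note that an empty result demonstrates equivalence only on the inputs tested.

```python
def find_mismatches(old_system, new_system, test_inputs):
    """Return the inputs on which the two implementations disagree."""
    return [x for x in test_inputs if old_system(x) != new_system(x)]


def legacy_distance(args):
    """Stand-in for a procedural routine operating on raw coordinates."""
    x1, y1, x2, y2 = args
    return abs(x1 - x2) + abs(y1 - y2)


class Unit:
    """Stand-in for an extracted object holding the same data."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_to(self, other):
        return abs(self.x - other.x) + abs(self.y - other.y)


def new_distance(args):
    x1, y1, x2, y2 = args
    return Unit(x1, y1).distance_to(Unit(x2, y2))


cases = [(0, 0, 3, 4), (1, 1, 1, 1), (-2, 5, 2, -5)]
mismatches = find_mismatches(legacy_distance, new_distance, cases)
```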

M.2  Quality  of  new  designs 

We regard the quality of the resulting design as a
significant metric; however, measuring quality is far from
straightforward. We based our findings in this area on the
traditional  view  of  design  quality  in  terms  of 
modifiability,  modularity,  levels  of  abstraction,  loose 
coupling,  and  high  cohesion  [30].  We  also  considered 
metrics  that  have  been  derived  specifically  for  object- 
oriented  designs,  including  depth  of  inheritance  tree 
(DIT), number of children (NOC), response for a class
(RFC), and lack of cohesion in methods (LCOM) [31].
Coupling can be measured by DIT and NOC; cohesion
by LCOM; abstraction by DIT and NOC; modifiability
by RFC and LCOM; and modularity by DIT and NOC.
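As an illustration of the two inheritance metrics, the sketch below computes DIT and NOC for a small class hierarchy using Python introspection. The hierarchy is hypothetical, and the convention that a root class has depth 0 (with Python's built-in object excluded) is an assumption; this is not the measurement tooling used in the study.

```python
def depth_of_inheritance(cls):
    """DIT: longest path from cls to a root class (roots have depth 0)."""
    bases = [b for b in cls.__bases__ if b is not object]
    if not bases:
        return 0
    return 1 + max(depth_of_inheritance(b) for b in bases)


def number_of_children(cls):
    """NOC: number of immediate subclasses of cls."""
    return len(cls.__subclasses__())


# Hypothetical hierarchy of simulation classes.
class SimObject:
    pass


class GroundUnit(SimObject):
    pass


class ArmoredUnit(GroundUnit):
    pass


class InfantryUnit(GroundUnit):
    pass


dit = {c.__name__: depth_of_inheritance(c)
       for c in (SimObject, GroundUnit, ArmoredUnit, InfantryUnit)}
noc = {c.__name__: number_of_children(c)
       for c in (SimObject, GroundUnit, ArmoredUnit, InfantryUnit)}
```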

For our case studies, we found low measures for both
DIT and NOC, which is expected given the conservative
approach to creating subclasses; a medium measure for
RFC, due to global variable usage; and low LCOM,
because the methodology ensures cohesion in the creation
of the objects. Thus, the designs were low on coupling,
high on cohesion, and generally good on modifiability.

M.3 Reuse rate of original program

Reuse  rate  of  the  program  is  measured  by  the  percent  of 
the  program  that  is  actually  utilized  in  the  extraction 
process.  If  reuse  rate  is  not  100%,  one  of  two  cases 
occurs:  1)  some  of  the  system  functionality  may  not  be 
preserved,  or  2)  statements  not  extracted  represent  dead 
code.  However,  100%  reuse  rate  does  not  imply  functional 
equivalence,  and  vice  versa.  The  reuse  rate  for  our  test 
programs  was  in  all  cases  greater  than  40%.  This  measure 
gives  another  perspective  from  which  to  assess  the  quality 
of  the  newly  created  design  abstractions. 
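The reuse rate can be computed directly from the statement sets produced by extraction. In the sketch below, statements are identified by hypothetical integer ids and the method names are invented; the leftover list is what would be inspected manually to decide between dead code and lost functionality.

```python
def reuse_report(all_statements, methods):
    """Return (reuse rate in percent, statements not used by any method)."""
    extracted = set().union(*methods.values()) & set(all_statements)
    rate = 100.0 * len(extracted) / len(all_statements)
    leftover = sorted(set(all_statements) - extracted)
    return rate, leftover


# Hypothetical program of eight statements and two extracted methods.
program = list(range(8))
methods = {"update_position": {0, 1, 3}, "update_status": {2, 4}}
rate, leftover = reuse_report(program, methods)
```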


4.  CONCLUSION 

Successful  re-engineering  requires  a  delicate  balance 
between  creative  concepts  for  requirements  enhancement 
and  computer  aid.  Bottom-up  tools  can  help  guide  this 
creative  process  and  help  to  ensure  its  accuracy. 

Our  experience  in  this  case  study  suggests  that 
prototyping and reuse can be a valuable aid in the
re-engineering of legacy systems, particularly in cases where
radical  changes  to  system  conceptualization  and  software 
structure  are  needed. 

In  particular,  we  found  that  constructing  even  a  very 
thin  skeletal  instance  of  the  proposed  new  architecture 
raised  many  issues  and  enabled  us  to  correct,  complete, 
and  optimize  the  architecture  for  both  simplicity  and 
performance. 

The  computer-aided  prototyping  tools  in  the  CAPS 
system  enabled  us  to  do  this  with  a  minimal  amount  of 
coding  effort.  The  bulk  of  the  code  was  generated 
automatically,  enabling  us  to  concentrate  on  system 
structuring  issues,  to  consider  and  evaluate  various 
alternatives,  and  to  improve  the  design  while  doing 
detailed  manual  implementation  for  only  a  few  pages  of 
critical  code. 

Abstract

Reuse  libraries  are  organizations  of  personnel, 
procedures,  tools,  and  software  components  directed 
toward  facilitating  software  component  reuse  to  meet 
specific  cost-effectiveness  and  productivity  goals.  The 
paper gives a survey of the major reusable software
component  repositories.  This  survey  will  be  a  base  to 
develop  future  efficiently  searchable,  user-friendly,  useful, 
and  well-organized  repositories. 

1. Introduction

Reuse  libraries  are  directed  toward  facilitating 
software  life  cycle  component  reuse  to  meet  cost- 
effectiveness  and  productivity  goals  [1].  The  principal 
rationale  for  the  existence  of  a  reuse  library  is  to  provide 
ready  access  to  reusable  components  by  the  staff  of 
development  and  maintenance  organizations,  and  to 
support  system  composition  and  rapid  prototyping  [2,  3]. 
The  number  of  cases  in  which  library  systems  are 
successfully  being  used  to  maintain  code  and  other 
reusable  software  life  cycle  components  continues  to 
increase.  It  is  essential  that  the  library  system  support 
developers  and  other  users  in  the  process  of  locating, 
retrieving,  comparing,  and  maintaining  reusable  software 
components. 

Reuse libraries are only one critical element of a
successful reuse program. In the past, reuse has primarily
been  the  result  of  opportunistic  success,  where  one 
program  was  able  to  take  advantage  of  the  efforts  of 
another.  There  must  be  a  paradigm  shift  from  current 
software  engineering  and  development  practices  to  a 
software  engineering  process  in  which  software  reuse  is 
institutionalized  and  becomes  an  inseparable  part  of  the 
software  development  process.  Reuse  must  be  systematic, 
driven  by  a  demand  for  software  components  identified  as 
a  result  of  domain  analysis  and  architecture  development. 
Reuse  needs  to  be  treated  as  an  integral  part  of  engineering 
and  acquisition  activities.  Most  importantly,  it  is  essential 
that an organizational infrastructure be implemented to
manage domains, define products and standards, establish
ownership  criteria,  allocate  investment  resources,  and 
direct  the  establishment  and  population  of  reuse  libraries. 
An  effective  infrastructure  will  guide  reuse  activities  to 
avoid  duplication  of  effort,  impose  necessary 
standardization,  and  ensure  library  population  is  user 
demand-driven. 

2.  Library  Mechanism 

Usually,  critical  reuse  library  capabilities  include  the 
following: 

•  an automated library system with a Graphical User
   Interface for browsing, searching, and retrieval;
•  a standard component framework (e.g., to include
   purpose, functional description, certification level, key
   environmental constraints, historical results of usage,
   and legal restrictions);
•  an effective classification scheme for each domain; and
•  thorough system and component documentation.

Each  library  system  must  be  designed  to  provide  as 
much  automated  support  as  possible  to  users  in 
identification,  comparison,  evaluation,  and  retrieval  of 
similar  reusable  components.  Support  for  adapting, 
transforming,  and  specializing  components  is  desirable.  It 
must  also  provide  a  range  of  support  to  users  in  locating 
and  comparing  the  relative  reusability  of  individual  library 
components.  Furthermore,  the  system  must  be  readily 
available  to  system  developers  if  it  is  to  be  used,  and  must 
support access from a variety of platforms. As the library
acquires a significant number of Reusable Software
Components  (RSCs),  an  automated  search  and  retrieval 
system  becomes  indispensable  [4,  5,  6].  Whatever  tool  is 
used,  the  library  must  have  a  way  to  classify  RSCs  so  that 
a  user  can  quickly  find  what  is  wanted  without  frustration 
and  delay.  Sophisticated,  expert  system,  knowledge-based 
approaches  and  new  technologies  for  high-speed  text 
search  are  the  subjects  of  current  research  efforts. 
Generally speaking, reusable software component retrieval
methods include browsing, keyword searching, the facet
approach, syntactic matching, and semantic matching [1].

Standard  component  frameworks  help  ease  the 
process  of  comprehension  and  comparison  of  similar 
components,  and  include  data  such  as  relative  numeric 
measures  for  reusability,  reliability,  maintainability  and 
portability  [7].  Inclusion  of  testing  and  component 
documentation  provides  additional  information  to  help  the 
potential  user  gauge  the  effort  required  to  tailor  the 
component  for  reuse. 

Effective  classification  schemes  are  essential  to  assist 
the  user  in  locating  and  comparing  library  components, 
and  to  speed  the  process  of  identifying  appropriate 
components for the task at hand [8, 9]. Finally, system and
component  documentation  complete  the  cycle  of 
evaluation,  and  enable  the  reuser  to  determine  which 
components  have  reuse  potential  with  regard  to  specific 
requirements,  and  to  fully  comprehend  the  process  of 
obtaining  components  for  reuse  in  a  new  application. 

In  addition,  other  equally  important  requirements  have 
been  identified  that  require  resolution  in  order  to  support 
cohesive, wide reuse. These include 1) integration of
library  capabilities  and  procedures  within  the  system 
development  and  acquisition  process;  2)  identification  and 
support  of  specific  requirements  associated  with  the 
security  and  integrity  of  reusable  components 
implementing  Trusted  Computing  Base  (TCB)  or  other 
security  capabilities;  and  3)  intercommunication  and 
interoperability  among  diverse  library  systems.  Experience 
has  shown  that  these  requirements  can  only  be  resolved 
through  the  combination  of  developing  technology, 
standard  procedures  and  evolution  or  revision  of  existing 
policies. 

There are different communities for which a
repository is necessary, and each community has
somewhat different repository requirements. These
communities include the national or horizontal
communities, the local or internal communities, and a
number of domain-specific vertical communities [10].

3.  Library  Operation 

The  reuse  library,  while  essential,  is  but  one 
ingredient  in  a  successful  reuse  program.  Experience  has 
shown  that  actual  support  of  reuse  activities  within  a  target 
domain  must  include  a  range  of  programmatic  and 
technological  support  that  includes  domain  analysis 
activities,  user  indoctrination  and  training,  metrics 
collection  and  analysis,  reuse  engineering  support,  and 
component  certification  and  reengineering. 

The  importance  of  domain  analysis  activities  as  an 
initial  step  in  implementation  of  a  reuse  library  cannot  be 
over-emphasized.  Domain  analysis  activities  are 
considered  to  be  an  integral  part  of  providing  reuse  support 
to  various  programs.  Standard  products  of  domain  analysis 
include identification of high-demand categories of
reusable components, domain-specific models and
architectures,  and  domain  specifications  and  taxonomies. 
These  direct  products  also  provide  the  basis  for 
development  of  long-term  implementation  plans  and 
domain  knowledge  bases. 

In  order  to  measure  reuse  success,  the  library  must 
collect  and  analyze  considerable  data  in  a  continuing 
assessment  of  the  library’s  procedures  and  tools,  the 
usefulness  of  its  RSC  collection,  the  accuracy  of  RSC 
classifications,  and  the  general  responsiveness  of  the 
library  to  the  needs  of  users. 

The  library  staff  receives  direction  in  the  form  of 
specific  operational  objectives,  principally  aimed  at 
making  software  reuse  cost-effective.  In  addition  to 
ensuring  that  RSCs  are  available,  the  library  is  in  a 
position  to  provide  other  support  to  help  ensure  that  the 
benefits  of  reuse  are  realized,  including  the  distribution  of 
published  manuals  like  Standards  and  Guidelines  and  user 
documentation  for  library  tools.  In  addition,  on-call 
assistance  should  be  made  available  to  users.  Reuse 
engineering  support  encompasses  a  wide  range  of 
engineering  activity.  These  activities  will  include  working 
within  individual  system  development  and  maintenance 
efforts  to  assist  in  (1)  identification,  selection  and 
reapplication  of  existing  reusable  software  components, 
(2)  quantification  of  potential  savings  or  cost  avoidance  as 
a  result  of  reuse,  and  (3)  design  and  implementation  of 
software  products  that  will  themselves  be  reusable  in 
future  efforts. 

Another key area is thorough library system
documentation. Documentation has proven to be an essential
aspect of establishing and operating a library.

4.  Some  Reusable  Software  Component 
Repositories 

4.1.  Commercial  Repositories 
•  +1  Reuse  Repository 

The +1 Reuse system was developed by +1 Software
Engineering Co. in California [31]. It currently runs on
Sun workstations under the Solaris operating system, with
a GUI based on OpenWindows, Motif, and CDE.

The  +1  Reuse  system  supports  reuse  repositories 
created  and  maintained  by  the  user,  project-wide  "filtered" 
repositories  under  strict  quality  controls,  and  selective 
reuse.  Selective  reuse  enables  reuse  of  any  submodel  from 
an  existing  or  re-engineered  +1  Environment  project.  In  a 
sense,  every  +1  Environment  project  is  a  reuse  library. 
Selective  reuse  significantly  improves  a  user’s  ability  to 
reuse  all  source  code  and  documentation  from  all  previous 
projects  and  at  any  granularity.  (To  the  best  of  our 
knowledge,  they  are  the  only  company  to  support  this 
feature.) 

The +1 Reuse system supports reuse of: design,
documentation,  source  code,  header  files,  test  cases,  test 
shell  scripts,  expected  test  results,  and  modeling 
information. 

All source code reverse engineered or developed
using the +1 Environment can be reused. +1 Reuse
addresses reuse issues such as reuse of source code under
configuration management and duplicate file names.
+1 Reuse supports three forms of reuse: User-Defined
Reuse  Library,  Filtered  Reuse  Library,  and  Selective 
Reuse.  Since  a  programmer's  productivity  can  be  increased 
by  reusing  existing  code  and  documentation,  +1  Reuse 
helps  to  make  all  source  code,  documentation,  header  files, 
and  test  files  reusable  by  its  support  of  submodels.  After  a 
submodel  has  been  selected,  +1  Reuse  copies  the  submodel 
and  its  associated  files  to  the  new  project  and  helps  to 
resolve  a  number  of  problems  which  may  arise  (e.g., 
identical  file  names  and  files  checked  in  under 
configuration  management). 

•  Software  Asset  Library  Management  System 
(SALMS) 

SALMS  is  a  system  for  classifying,  describing,  and 
querying  reusable  assets  [32].  Reuse  of  software  assets  at 
all  phases  of  the  software  engineering  life-cycle  is 
recognized  as  being  one  of  the  major  enablers  for 
productivity  and  quality  improvements.  However,  a 
common  inhibitor  to  company-wide  reuse  is  often  the  lack 
of  visibility  of  reusable  assets  within  the  developer 
community. 

A  central  repository  for  reusable  assets  provides  a 
solution to this problem. The main purpose of such a
repository is to provide mechanisms for classification and
storage  of  software  assets,  along  with  techniques  for 
efficiently  retrieving  them. 

SALMS (Software Asset Library Management
System) is a software product that provides these
mechanisms. It fills the gap between development-for-reuse
activities (building, acquiring, or re-engineering
reusable assets) and development-with-reuse activities
(using  reusable  assets  in  the  creation  of  new  software 
products).  It  plays  a  central  role  in  the  implementation  of  a 
company’s  reuse  program. 

In  addition,  SALMS  also  provides  features  for  the 
requirement  management  activity,  and  for  the  creation  and 
management of a company's technical library. SALMS can
be distributed over a customer's network of PCs or UNIX
workstations and thus be accessible to all developers
within a software organization. The user interface is based
on Web technology.

In  SALMS,  an  asset  can  be  viewed  as  a  collection  of 
artifacts  produced  throughout  the  life-cycle,  such  as 
requirements,  architecture  models,  design  specifications, 
source  code,  or  test  scripts. 


•  Automated  Software  Reuse  Repository  (ASRR) 

The  Automated  Software  Reuse  Repository  (ASRR) 
tool  provides  users  with  a  searchable  repository  of  reuse 
information  [33].  It  consists  of  two  main  parts,  the 
administration  tool  and  the  reuse  repository.  The 
administration  portion  of  the  tool  performs  user 
administrative  functionality  including:  the  ability  to  add, 
delete,  or  change  users  and  their  attributes.  The  attributes 
include  the  following:  security  levels,  group  and  security 
permissions  to  add,  edit  and  delete  modules.  The  reuse 
repository  allows  the  user  to  upload  modules  and  store 
them  in  a  searchable  repository. 

The  ASRR  provides  the  following  functions: 

Program  Control.  Provides  complete  login  control  for 
the  ASRR. 

Protection.  The  ASRR  can  limit  a  user’s  edit,  delete, 
viewing,  add,  upload  and  download  module 
permissions  through  the  administration  portion  of  the 
tool. 

Security.  The  ASRR  tool  provides  extra  security  for 
inactive  users  by  logging  them  out  of  the  ASRR  after 
a  30-minute  period  of  inactivity. 

Easy  Access  to  Reuse  Items.  The  ASRR  tool  allows 
registered  users  flexibility  in  searching  for  reuse  items 
in  the  reuse  repository  by  allowing  the  users  to  search 
for  strings  of  words  using  “not”,  “or”,  or  “and”  in 
searching. 

Reuse  Information  Readily  Available  for  Users. 
Specific  information  is  available  for  reuse  module 
items  including  the  platforms  utilized,  ease  of  reuse 
and  any  additional  information  obtained  from  users. 
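The kind of boolean keyword search described for the ASRR can be sketched as follows. This illustrates the general technique only; it does not reproduce the ASRR's actual query syntax or storage, and the repository contents, the function names, and the all_of / any_of / none_of parameters are assumptions.

```python
def matches(description, all_of=(), any_of=(), none_of=()):
    """Test a description against "and", "or", and "not" keyword lists."""
    words = set(description.lower().split())
    return (all(w in words for w in all_of)
            and (not any_of or any(w in words for w in any_of))
            and not any(w in words for w in none_of))


def search(repository, **query):
    """Return the names of the modules whose descriptions match the query."""
    return [name for name, text in repository.items() if matches(text, **query)]


# Hypothetical repository: module name -> free-text description.
repository = {
    "queue_ada": "priority queue package ada",
    "queue_c": "priority queue module c",
    "matrix_ada": "sparse matrix package ada",
}
queues_not_c = search(repository, all_of=["queue"], none_of=["c"])
ada_items = search(repository, all_of=["ada"], any_of=["matrix", "queue"])
```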

•  The  Universal  Repository 

The  Universal  Repository  was  developed  by  Unisys 
[34].  It  is  designed  to  help  customers  move  forward  into  a 
repository-based  development  environment. 

The  Universal  Repository,  which  is  based  on  object- 
oriented  principles,  can  function  as  the  backbone  of  a 
flexible  workgroup  or  enterprise  development 
environment.  At  the  core  of  this  repository  is  the 
Repository  Services  Model  (RSM)  -  which  can  encompass 
representations  of  all  tools,  database  management  systems 
(DBMSs),  programming  languages,  business  rules,  and 
data. 

Customers  can  extend  the  Universal  Repository  by 
adding  their  own  models  based  on  the  structures  provided 
in  the  RSM.  The  summation  of  all  models  defined  in  a 
repository  is  called  the  information  model.  Each  part  of 
a customer's development environment becomes an
integrated  piece  of  the  whole  when  customers  use  the 
models  encompassed  within  the  information  model.  This 
unified  view  enables  both  developers  and  customers  to 
achieve  inter-tool  integration. 
In addition to its modeling capabilities, the Universal
Repository  offers  features  that  enhance  customers’ 
development  environment,  manage  organizational 
information,  and  make  such  information  available  to 
everyone in a customer's organization.

Unisys  is  dedicated  to  improving  customers’  product 
lines  with  the  Universal  Repository.  Support  and  training 
are  available  to  help  customers  quickly  adopt  this  new 
technology.  By  providing  a  shared  catalog  of  all  software 
components,  a  repository  promotes  reuse.  It  makes  it  easy 
to  locate  and  access  components  for  reuse  in  multiple 
applications.  Reusing  software  components  can  enhance 
quality.  Customers  can  develop,  validate,  and  verify  a 
component  for  use  in  one  product.  When  customers  reuse 
that  component,  they  expend  less  time  and  fewer  resources 
to  validate  and  verify  that  component  for  use  in  other 
products  [11].  A  single  change  to  correct  a  defect  in  a 
reused  component  is  reflected  in  all  tools  using  that 
component.  Such  consistency  among  products  ensures 
their integration and interoperability when they are ported
to  different  operating  platforms. 

•  AIRS 

AIRS is an AI-based library system for software
reuse,  which  was  developed  by  E.J.  Ostertag,  J.A.  Hendler, 
R.  Prieto-Diaz,  C.  Braun  [12].  AIRS  allows  a  developer  to 
browse  a  software  library  in  search  of  components  that 
best  meet  some  stated  requirement.  A  component  is 
described by a set of (feature, term) pairs. A feature
represents  a  classification  criterion,  and  is  defined  by  a  set 
of  related  terms  [10,  12].  AIRS  also  allows  representation 
of  packages,  that  is,  logical  units  that  group  a  set  of  related 
components.  As  with  components,  packages  are  described 
in  terms  of  features.  Unlike  components,  a  package 
description  includes  a  set  of  member  components. 
Candidate  reuse  components  (and  packages)  are  selected 
from  the  library  based  on  the  degree  of  similarity  between 
their descriptions and a given target description [13].
Similarity  is  quantified  by  a  non-negative  magnitude 
(called  distance)  that  represents  the  expected  effort 
required  to  obtain  the  target  given  a  candidate.  Distances 
are  computed  by  functions  called  comparators.  Three  such 
functions  are  presented:  subsumption,  closeness,  and 
package  comparators.  The  AIRS  classification  approach  is 
based  on  a  formalization  of  the  concepts  and  is  similar  to 
faceted classification [44]. The functionality of a prototype
implementation  of  the  AIRS  system  is  illustrated  by 
application  to  two  different  software  libraries:  a  set  of  Ada 
packages  for  data  structure  manipulation,  and  a  set  of  C 
components  for  use  in  Command,  Control,  and 
Information  Systems. 
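
The idea of ranking candidates by distance can be sketched as follows. This is a simplified illustration, not the AIRS subsumption, closeness, or package comparators: it assumes a hypothetical table of term-to-term efforts per feature and a fixed penalty for a missing or unknown term, and all names and numbers below are invented for the example.

```python
# Hypothetical effort table: feature -> {(candidate_term, target_term): effort}.
TERM_DISTANCE = {
    "structure": {("stack", "queue"): 2.0, ("queue", "stack"): 2.0},
    "language": {("c", "ada"): 5.0, ("ada", "c"): 5.0},
}
MISSING_PENALTY = 10.0  # assumed cost when no effort estimate is available


def distance(candidate, target):
    """Non-negative effort estimate for obtaining the target from the candidate."""
    total = 0.0
    for feature, wanted in target.items():
        have = candidate.get(feature)
        if have is None:
            total += MISSING_PENALTY
        elif have != wanted:
            table = TERM_DISTANCE.get(feature, {})
            total += table.get((have, wanted), MISSING_PENALTY)
    return total


def rank(library, target):
    """Return component names ordered by increasing distance to the target."""
    return sorted(library, key=lambda name: (distance(library[name], target), name))


# Each component is described by a set of (feature, term) pairs, held in a dict.
LIBRARY = {
    "ada_stack": {"structure": "stack", "language": "ada"},
    "ada_queue": {"structure": "queue", "language": "ada"},
    "c_queue": {"structure": "queue", "language": "c"},
}
TARGET = {"structure": "stack", "language": "ada"}
```

Under these assumed efforts, `rank(LIBRARY, TARGET)` lists the exact match first and the component that differs in both features last.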

•  Reuse  Library  Toolset  (RLT) 


EVB  Software  Engineering,  Inc.  announced  the 
commercial  release  of  the  Reuse  Library  Toolset  (RLT)  in 
1994 [35]. RLT is a system for creating and managing
collections  of  reusable  assets  independent  of  programming 
language,  design  method,  or  development  process.  To 
represent  all  life-cycle  assets  RLT  employs  the  Extended 
Faceted  Classification  System,  controlled  keyword, 
attribute  value  (frames),  and  asset  interdependencies. 

Experience  has  shown  that  the  cost  of  producing 
software  is  significantly  reduced  when  reuse  is  an  integral 
part  of  the  process.  RLT  supports  all  reuse  oriented  tasks, 
from  library  management  through  domain  analysis  to  asset 
search  and  retrieval.  With  its  intuitive  graphical  user 
interface,  RLT  is  easy  for  beginners  to  learn,  yet  provides 
powerful  functionality  for  advanced  users  with  complex 
needs. 

RLT  provides  reuse  and  library  metrics,  client-server 
architecture,  and  ability  to  exchange  library  information 
across  multiple  platforms  and  databases.  These  include: 
DEC  Alpha  OSF1,  HP/UX,  SGI,  SunOS,  Solaris, 
Informix, Oracle, and Sybase. Additional platforms
supported in 1995 include Windows 3.1/NT and
OS/2.

RLT's  open  architecture  allows  easy  integration  with 
existing  CASE  and  development  tools,  such  as  structure 
design  tools,  versioning  systems  and  configuration 
management  systems. 

•  HSTX  Reuse  Repository 

The  HSTX  Reuse  Repository  was  developed  by 
Hughes  STX  Corporation  [36].  The  mechanisms  are 
designed  so  users  can  search/browse  the  contents  of  the 
Reuse  Repository  for  what  they  need  and  submit 
contributions  to  the  Reuse  Repository  librarian  through 
WWW  pages. 

4.2.  Government  Repositories 

•  Defense  Software  Repository  System  (DSRS) 

The  DSRS  is  an  automated  repository  for  storing  and 
retrieving  Reusable  Software  Assets  (RSAs)  [14].  The 
DSRS  software  now  manages  inventories  of  reusable 
assets  at  seven  software  reuse  support  centers  (SRSCs). 
The DSRS serves as a central collection point for quality
RSAs,  and  facilitates  software  reuse  by  offering 
developers  the  opportunity  to  match  their  requirements 
with  existing  software  products. 

DSRS  accounts  are  available  for  Government 
employees  and  contractor  personnel  currently  supporting 
Government  projects.  The  Account  Request  Form  must  be 
approved  and  signed  by  the  requestor's  Government 
Project  Manager  prior  to  submission  to  the  SRP.  The 
Customer  Assistance  Office  (CAO)  is  the  SRP  point  of 
contact for both technical and non-technical information
and  support. 

The  Defense  Software  Repository  System  (DSRS) 
supports  reusable  asset  classification  to  comply  with 
published guidance (DoD 8020.1-M and TAFIM), support
domain  engineering,  establish  more  effective  asset 
searching,  and  increase  interoperability.  The  DoD  software 
community  is  trying  to  change  its  software  engineering 
model  from  its  current  software  cycle  to  a  process-driven, 
domain-specific,  architecture-based,  repository-assisted 
way  of  constructing  software  [15].  In  this  changing 
environment,  the  DSRS  has  the  highest  potential  to 
become  the  DoD  standard  reuse  repository  because  it  is  the 
only  existing  deployed,  operational  repository  with 
multiple  interoperable  locations  across  DoD.  Seven  DSRS 
locations  support  nearly  1,000  users  and  list  nearly  9,000 
reusable  assets.  The  DISA  DSRS  alone  lists  3,880  reusable 
assets  and  has  400  user  accounts. 

DSRS  is  adaptable  to  additional  types  of  reusable 
assets  and  better  methods  of  describing  them.  The 
description  of  repository  assets  is  called  classification. 
This  paper  reports  the  results  and  recommendations  of  a 
study  of  classification  methods  for  storage  and  retrieval  of 
Reusable  Assets  (RAS)  in  the  DSRS.  The  Defense 
Software  Repository  System  (DSRS)  reusable  asset 
classification  is  changing  to  achieve  policy  compliance, 
support  domain  engineering,  establish  more  effective  asset 
searching,  and  increase  interoperability. 

The  far-term  strategy  of  the  DSRS  is  to  support  a 
virtual  repository.  These  interconnected  repositories  will 
provide  the  ability  to  locate  and  share  reusable  components 
across  domains  and  among  the  services.  An  effective  and 
evolving  DSRS  is  a  central  requirement  to  the  success  of 
the  DoD  software  reuse  initiative.  Evolving  DoD 
repository  requirements  demand  that  DISA  continue  to 
have  an  operational  DSRS  site  to  support  testing  in  an 
actual  repository  operation  and  to  support  DoD  users.  The 
classification  process  for  the  DSRS  is  a  basic  technology 
for  providing  customer  support  [16].  This  process  is  the 
first  step  in  making  reusable  assets  available  for 
implementing  the  functional  and  technical  migration 
strategies. 

•  Library  Interoperability  Demonstration  (LID) 

The  Library  Interoperability  Demonstration  (LID)  is  a 
prototype  library  system  [17,  18].  It  is  used  to  illustrate 
how  monolithic  reuse  libraries  can  be  decomposed  into 
distinct,  functional  layers  connected  by  open  interfaces, 
such  as  those  specified  by  Asset  Library  Open 
Architecture  Framework  (ALOAF).  It  is  a  collaboration 
between  SAIC  and  Unisys.  The  demonstration  shows  how 
the  physical  storage  of  assets  can  be  separated  from  the 
cataloging  of  assets,  and  how  a  user  can  choose  a  single, 
local,  user  interface  tool  to  access  multiple  reuse  libraries. 


The  STARS  Program  developed  a  specification  of  an 
ALOAF  to  support  an  "open  systems"  approach  to 
constructing  asset  libraries.  The  ALOAF  evolved  to 
incorporate  interfaces  specifically  intended  for 
interoperability,  culminating  in  the  release  of  ALOAF 
Version  1.2  [19].  The  LID  builds  upon  the  open  interfaces 
provided  by  ALOAF,  its  associated  Asset  Interchange 
Language  (AIL),  PCTE,  OSF/Motif,  and  POSIX.  As 
shown  in  the  LID  Software  Architecture  diagram,  a  reuse 
library  can  be  divided  into  three  distinct  layers  which  are 
connected  via  open  interfaces,  thus  providing  opportunities 
for  interoperability  at  each  layer.  The  three  layers  are: 

User  Interfaces.  The  demonstration  includes  two  user 
interface  tools:  a  graphical  browser  derived  from  the 
Unisys  Reuse  Library  Framework  (RLF)  and  a  text- 
based  browser  modeled  after  SPS's  InQuisiX  reuse 
library  system.  Both  tools  are  built  upon  Ada  bindings 
to  OSF/Motif. 

Asset  Catalogs.  The  demonstration  shows  two  asset 
catalogs.  The  first  catalog  is  derived  from  the  Unisys 
collection  of  ASW  components,  and  resides  on  an 
IBM  RISC  System/6000  at  the  STARS  Technology 
Center.  The  second  catalog  is  derived  from  SAIC’s 
collection  of  flight  simulator  components,  and  resides 
on  an  IBM  RISC  System/6000  at  the  SAIC  offices  in 
Orlando,  FL.  The  interface  between  each  of  the 
catalogs  and  the  user  interface  tools  is  defined  by  the 
ALOAF. 

Asset  Storage.  In  the  demonstration,  the  storage  of 
assets  is  provided  by  the  AFS  cell  at  the  STARS 
Technology  Center.  Neither  catalog  stores  assets 
itself;  instead,  both  catalogs  "subcontract"  the  storage 
function  to  the  AFS  server. 
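
The three-layer separation can be sketched as follows. This is a minimal illustration of the layering only, assuming in-memory dictionaries in place of real catalogs and the AFS server; the class and function names are hypothetical and do not correspond to ALOAF interfaces.

```python
class AssetStorage:
    """Storage layer: holds asset contents keyed by a storage id."""

    def __init__(self):
        self._blobs = {}

    def put(self, storage_id, content):
        self._blobs[storage_id] = content

    def get(self, storage_id):
        return self._blobs[storage_id]


class AssetCatalog:
    """Catalog layer: keeps descriptions only and delegates storage."""

    def __init__(self, name, storage):
        self.name = name
        self._storage = storage
        self._entries = {}  # asset name -> (description, storage id)

    def register(self, asset_name, description, content):
        storage_id = f"{self.name}/{asset_name}"
        self._storage.put(storage_id, content)
        self._entries[asset_name] = (description, storage_id)

    def search(self, keyword):
        key = keyword.lower()
        return sorted(n for n, (d, _) in self._entries.items() if key in d.lower())

    def fetch(self, asset_name):
        _, storage_id = self._entries[asset_name]
        return self._storage.get(storage_id)


def browse(catalogs, keyword):
    """User-interface layer: one tool querying several catalogs."""
    return [(c.name, hit) for c in catalogs for hit in c.search(keyword)]


# Two catalogs sharing one storage service (sample data is invented).
shared = AssetStorage()
asw = AssetCatalog("asw", shared)
sim = AssetCatalog("flightsim", shared)
asw.register("sonar_filter", "signal filter component", "-- source text --")
sim.register("terrain_model", "terrain component for simulators", "-- source text --")
```

Because neither catalog owns the asset contents, a different storage service could be substituted without changing the catalog or user-interface code.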

•  Integrated  -  Computer  Aided  Software 
Engineering  (I-CASE) 

I-CASE was developed by the Air Force Reuse Center
(AFRC)  [38].  The  Air  Force  Reuse  Center  is  the  Air  Force 
Management  Information  Systems  (MIS)  repository  for 
reusable  software  assets.  These  assets  are  primarily  Ada 
source  code  modules  consisting  of  Government  and 
commercial  packages.  The  library  has  over  1,200  assets 
including  many  assets  of  the  system  life-cycle,  such  as 
requirements,  designs,  documentation  and  source  code. 
Integrated  Computer-Aided  Software  Engineering  (I- 
CASE)  provides  a  contract  for  DoD  users  to  purchase  an 
integrated  set  of  tools  that  will  automate  many  of  the  MIS 
software  development  activities  over  the  entire  software 
development  and  maintenance  life-cycle.  I-CASE  also 
provides  the  support  elements  necessary  to  implement, 
operate,  and  maintain  the  I-CASE  environment  (i.e., 
training,  maintenance,  and  technical  support).  The  overall 
strategy  of  this  project  is  to  automate  reuse  processes 
through  an  Integrated-Computer  Aided  Software 
Engineering  (I-CASE)  environment.  The  specific  strategy 
is to implement these reuse processes within a workflow
environment  to  certify  or  re-engineer  reusable  assets  as 
quickly  as  possible.  The  EVB  Reuse  Library  Tool  (RLT), 
supplied  as  part  of  I-CASE,  is  used  as  the  reuse  repository 
tool. 

• Multimedia Oriented Repository Environment (MORE)

As  the  World  Wide  Web  (WWW)  becomes  very 
popular  among  internet  users,  an  increasing  number  of 
public  repositories  are  using  the  WWW  to  promote  their 
services.  The  Electronic  Library  Services  and  Applications 
(ELSA)  project  is  the  operational  part  of  the  Repository 
Based  Software  Engineering  (RBSE)  program  [20].  RBSE 
is  a  National  Aeronautics  and  Space  Administration 
(NASA)  sponsored  program  dedicated  to  introducing  and 
supporting  common,  effective  approaches  to  designing, 
building,  and  maintaining  software  systems  by  using 
existing  software  assets  stored  in  a  specialized  library  or 
repository. 

In  addition  to  operating  a  software  lifecycle 
repository,  RBSE  promotes  software  engineering 
technology  transfer,  academic  and  instructional  support  for 
reuse  programs,  the  use  of  common  software  engineering 
standards  and  practices,  software  reuse  technology 
research,  and  interoperability  between  reuse 
libraries/repositories. 

During  its  life  cycle,  the  ELSA  project  responded  to 
emerging  technologies,  the  growing  sophistication  of  its 
client  base,  and  industry  trends  by  advancing  the 
capabilities  of  its  management  software.  Thus,  ELSA 
stands  as  a  customer-driven  environment  employing  an 
advanced  library  management  mechanism,  MORE 
(Multimedia  Oriented  Repository  Environment). 

ELSA  replaced  AdaNet  on  August  31,  1994  when  the 
first  public  access  to  its  new  service  was  granted.  The 
library  is  the  operational  part  of  the  Repository  Based 
Software  Engineering  (RBSE)  program  which  is  a  NASA 
sponsored  initiative  in  software  reuse.  In  a  timeframe  of 
approximately  two  weeks,  ELSA  transitioned  its  library 
holdings  and  accompanying  metadata  from  a  monolithic 
X-Windows  based  system  to  MORE.  The  improved 
interface  employs  client/server  technology  and  is 
accessible  through  the  WWW.  MORE  is  a  public  domain, 
metadata  based  repository  tool  employing  the  WWW  as  its 
sole  user  interface.  It  consists  of  a  set  of  application 
programs which operate together with a stock httpd server
to  provide  access  to  a  database  of  metadata  [21].  The 
entire  interface,  client  browsing  and  searching,  repository 
definition,  data  entry  and  other  administrative  functions, 
are  provided  through  stock  Web  clients. 

Repository  assets  are  classified  using  a  collection 
(topic)  and  class  (type)  paradigm.  According  to  their 
subject  matter,  they  are  included  in  the  collections  or 
subordinate collections that best represent domain
coverage. The assets are also classified by media or
information  type  through  the  class  approach.  Thus,  users 
can  view  the  information  from  a  top-down  perspective 
through  the  hierarchy  of  collections  or  across  collections 
by  the  hierarchy  of  classes. 
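
The two views of the collection and class paradigm can be sketched as follows. This is an illustrative assumption about how such metadata might be held, not the MORE data model: the field names, the slash-separated collection paths, and the sample assets are all hypothetical.

```python
# Hypothetical asset metadata: each asset has one collection path and one class.
ASSETS = [
    {"name": "sort_pkg", "collection": "software/ada", "cls": "source code"},
    {"name": "reuse_guide", "collection": "software/ada", "cls": "document"},
    {"name": "hla_overview", "collection": "simulation", "cls": "document"},
]


def by_collection(assets, prefix):
    """Top-down view: assets in a collection or any of its subcollections."""
    return sorted(
        a["name"]
        for a in assets
        if a["collection"] == prefix or a["collection"].startswith(prefix + "/")
    )


def by_class(assets, cls):
    """Cross-cutting view: assets of one class across all collections."""
    return sorted(a["name"] for a in assets if a["cls"] == cls)
```

Here `by_collection(ASSETS, "software")` walks down the topic hierarchy, while `by_class(ASSETS, "document")` cuts across collections by information type.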

MORE  was  designed  to  support  this  collection  and 
class  model.  Navigation  is  achieved  through  the  activation 
of  high-level  hypertext  links  which  ultimately  lead  to 
metadata  or  assets  themselves.  Searching  (Natural 
Language  or  Pattern  Match)  is  performed  against 
information  provided  in  the  metadata  [22,  23,  24].  This 
combination  provides  users  with  a  reliable  and  efficient 
means  of  accessing  a  high  volume  of  assets. 

Administrative  functions  are  specifically  designed  to 
meet librarians' needs. For instance, assets are stored in
"developmental" mode which provides a cleanroom
environment  for  the  performance  of  population  and/or 
certification  activities.  Developmental  assets  are  only 
available  for  viewing  by  librarians.  Following  the 
completion  of  these  processes,  each  asset  is  promoted  to 
"production"  mode  and  is  therefore  accessible  to  the 
general  user  population. 

Each  collection  can  have  one  or  more  groups 
associated  with  it  that  are  authorized  to  access  the  assets 
and  subcollections  making  up  the  collection.  Groups  in 
turn  are  made  up  of  sets  of  users  and  other  groups;  all 
defined  through  the  librarian  interface.  Users  not 
transitively  a  member  of  a  designated  group  for  a  given 
collection  will  never  see  the  collection,  or  its  contents, 
through  any  of  the  browser  or  search  mechanisms.  This 
mechanism  supports  the  definition  of  multiple  virtual 
repositories  in  a  single  physical  repository,  reducing 
administrative  overhead  and  allowing  direct  sharing  of 
assets. 
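
The transitive membership rule can be sketched as follows. This is a minimal illustration under stated assumptions: groups are held as a dictionary from group name to a set of members, a member that is itself a key of that dictionary is treated as a nested group, and all names below are hypothetical rather than part of MORE.

```python
def is_member(user, group, groups, _seen=None):
    """True if user belongs to group directly or through nested groups."""
    seen = set() if _seen is None else _seen
    if group in seen:  # guard against cyclic group definitions
        return False
    seen.add(group)
    members = groups.get(group, set())
    if user in members:
        return True
    return any(is_member(user, m, groups, seen) for m in members if m in groups)


def visible_collections(user, collections, groups):
    """Collections the user may see: those with at least one authorized group."""
    return sorted(
        name
        for name, allowed in collections.items()
        if any(is_member(user, g, groups) for g in allowed)
    )


# Invented sample data: "contractors" is nested inside "nasa".
GROUPS = {
    "nasa": {"alice", "contractors"},
    "contractors": {"bob"},
    "public": {"carol"},
}
COLLECTIONS = {
    "flight_software": {"nasa"},
    "tutorials": {"public", "nasa"},
}
```

A user outside every designated group gets an empty list, which corresponds to never seeing the collection through browsing or search.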

• Asset Source for Software Engineering Technology (SAIC/ASSET)

Asset  Source  for  Software  Engineering  Technology 
(SAIC/ASSET)  offers  products  and  services  in  digital 
library  support,  electronic  commerce  and  software 
engineering  with  an  emphasis  on  reengineering  and  reuse 
[26].  SAIC/ASSET,  established  by  Advanced  Research 
Projects Agency (ARPA) as a subtask under the Software
Technology  for  Reliable  Systems  (STARS)  program,  is 
transitioning  to  a  private  enterprise  as  a  division  of  Science 
Applications  International  Corporation  (SAIC). 

SAIC/ASSET's  primary  mission  is  to  provide  a 
distributed  support  system  for  software  reuse  with  the 
Department  of  Defense  (DoD)  and  to  help  foster  a 
software  reuse  industry  within  the  United  States. 
SAIC/ASSET's  initial  and  current  focus  is  on  software 
development  tools,  reusable  components  and  documents 
on  software  development  methods.  SAIC/ASSET  is 
participating  in  interoperation  with  other  reuse  libraries 
such  as: 
• Comprehensive Approach for Reusable Defense
Software  (CARDS) 

•  Ada  and  Software  Reuse  Information  Clearinghouse 
Defense  Software  Repository  System  (DSRS) 

•  Electronic  Library  Services  &  Applications  Lobby 
(ELSA) 

The  goals  SAIC/ASSET  are  pursuing  involve: 

•  Creating  a  focal  point  for  software  reuse  information 
exchange 

•  Advancing  the  technology  of  software  reuse 

•  Providing  an  electronic  marketplace  for  reusable 
software  products  to  the  evolving  national  software 
reuse  industry. 

To  achieve  these  goals,  SAIC/ASSET  operates  the 
Worldwide  Software  Reuse  Discovery  (WSRD)  Library. 
The  WSRD  Library  is  populated  with  quality  reusable 
software  components  which  can  be  distributed  to  its 
subscribers.  WSRD  contains  over  700  assets  available  to 
over  1500  users  throughout  the  world.  The  library 
specializes  in  software  lifecycle  artifacts  and  documents 
written  specifically  to  promote  software  reuse  and 
development.  SAIC/ASSET  users  have  access  to  other 
components  stored  in  the  CARDS  and  DSRS  reuse 
libraries.  Through  the  WSRD,  users  can  search,  browse 
and  download  asset  catalogs  in  over  30  domains. 
SAIC/ASSET's  World  Wide  Web  pages,  located  at 
http://source.asset.com/,  describe  products  and  services 
offered  through  SAIC/ASSET,  as  well  as  information 
related  to  software  reuse. 

•  The  Public  Ada  Library  (PAL) 

Since  1984,  the  Ada  Software  Repository  (ASR)  has 
been  a  major,  publicly  available  source  of  Ada  code.  Now 
called  the  Public  Ada  Library  (PAL)  [27],  it  provides  more 
than  100  megabytes  of  programs,  components,  tools, 
general  information,  and  educational  materials  on  Ada.  It 
also  contains  materials  on  the  Very  High  Speed  Integrated 
Circuit  (VHSIC)  Hardware  Description  Language 
(VHDL),  which  is  based  on  Ada. 

For  those  with  access  to  the  Internet,  the  PAL  can  be 
accessed  via  the  File  Transfer  Protocol  (FTP).  The  PAL  is 
located  on  the  wuarchive.wustl.edu  host,  and  on  mirror 
sites at ftp.cnam.fr and ftp.cdrom.com. Also, the PAL can
be  obtained  on  disk,  tape,  and  compact-disk  read-only 
memory  (CD-ROM). 

Additionally,  the  PAL  can  be  accessed  by  means  of 
such  Internet  services  as:  the  Network  File  System  (NFS), 
which  allows  computers  to  share  files  across  a  network; 
archie, a system for querying anonymous-FTP sites; and
gopher,  via  gopher  servers  wuarchive.wustl.edu  and 
gopher.wustl.edu. 

•  CAPS  Software  Reusable  Component  Repository 


CAPS  (Computer  Aided  Prototyping  System)  is  a 
research  project  developed  by  the  Software  Engineering 
Group  led  by  Prof.  Luqi  at  Naval  Postgraduate  School 
[39]. An initial implementation of the CAPS software base was
first  explored  in  1988  [40].  An  implementation  of  the 
software  base  was  accomplished  in  1991  by  using 
ONTOS,  an  object  oriented  data  base  management  system 
that  provides  an  interface  to  C++  for  customization  and 
flexibility [41]. The CAPS software base has been undergoing
conversion to a software component repository since 1998 [1]. The
CAPS  component  repository  supports  two  critical 
functions,  component  storage  and  component  retrieval. 
Much  effort  has  been  made  to  improve  the  component 
retrieval  method  [42,  43].  To  the  best  of  our  knowledge, 
the CAPS repository is the only one that supports both profile
matching and signature matching. Its retrieval method provides
high precision and high recall at the same time [1]. The
CAPS  repository  is  still  under  construction.  A  prototype 
has  been  developed  to  verify  the  performance  of  the 
retrieval  methods  [1]. 
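
Precision and recall, the two measures mentioned above, can be computed as in the following sketch. It uses the standard definitions and does not reproduce the CAPS profile or signature matching; the function name and the sample component names are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: four components returned, three actually suitable.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
```

A retrieval method is usually tuned toward one measure at the expense of the other, which is why achieving both at once is notable.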

• The Ada Library and the Reuse Library at the Defense Information Systems Agency (DISA)

The  Ada  Library  and  the  Reuse  Library  at  the  Defense 
Information Systems Agency (DISA) are public, non-lending,
reference libraries for all professionals, students,
and  researchers  seeking  information  on  the  Ada 
programming  language  and  on  software  reuse  [37].  The 
number  of  books  and  articles  on  Ada  and  on  reuse  grows 
daily.  Also,  there  is  a  wealth  of  information  available  on 
the  Internet  and  on  the  World  Wide  Web.  Putting  the  Net 
together  with  the  Ada  and  Reuse  Libraries  makes  a  very 
powerful  research  tool. 

Both  Libraries  collect  and  hold  information  found  in 
documents,  books,  conference  proceedings,  newspaper  and 
journal  articles,  and  other  multimedia  material. 

The  Libraries  can  provide  assistance  in  two  ways: 
helping  users  find  publications  in  each  library,  and 
conducting  on-line  searches  for  published  information 
available  elsewhere.  Users  can  access  these  resources  in 
person,  and  via  the  Web,  or  they  can  call  DISA  to  request 
a  search. 

Over  the  Web,  go  to  http://sw-eng.falls-church.va.us. 
There, click on "Library" at the main page of either the
AdaIC or the ReuseIC. Users can search the database by title,
author,  subject,  or  publisher. 

5.  Comparison 

Commercial  reusable  component  repositories  usually 
are  integrated  into  a  CASE  environment  [28,  29]. 
Currently, some major repositories (ASSET, PAL, and
DSRS) have begun to use web-based techniques to provide
services.  They  are  utilizing  flat  files  written  in  HyperText 
Markup  Language  (HTML).  Electronic  Library  Services 
and Applications (ELSA) has gone a step further by using
the  Multimedia  Oriented  Repository  Environment 
(MORE). 


Following  are  some  comparison  results  for  all  the 
repositories  listed  above. 


Features

Reuse Repository | Features marked Y (among Web-Based, Integrated into CASE Environment, Security Control) | Retrieval Methods
SALMS, ASRR | Y Y Y Y | Browsing; Keywords
The Universal Repository | Y Y Y | Keywords; Browsing and Keywords
AIRS | Y | Facets Approach
RLT | Y | Keywords
HSTX Reuse Repository | Y Y | Keywords
DSRS | Y Y | Keywords
LID | Y | Keywords
I-CASE | Y Y | Keywords
MORE | Y Y | Keywords
ASSET | Y Y Y | Keywords
PAL | Y Y | Keywords
CAPS | Y | Browsing, Keywords, Profile & Signature Matching
Ada Library and Reuse Library (DISA) | Y Y | Browsing and Keywords

6.  Conclusion 

Web-based  reuse  is  the  trend  of  software  component 
repositories  supported  by  the  government.  To  be  a  part  of 
an  integrated  CASE  environment  is  the  trend  of 
commercial  software  component  repositories.  Usually,  the 
aim  of  the  first  one  is  to  provide  a  service  within  a  domain, 
organization,  or  area,  such  as  ASSET  for  DoD,  DSRS  for 
DISA  etc.  This  kind  of  repository  is  used  in  a  wide  scope. 
The  aim  of  the  second  is  to  provide  an  integrated  CASE 
environment  for  a  software  development  organization.  So, 
this kind of repository is generally a part of a CASE
environment  and  is  used  in  a  relatively  narrow  scope. 

The long-term goal of the CAPS project [1] is to
provide  a  distributed  software  component  repository  to 
support  the  development  of  prototype  systems  through 
intranet  technology.  So,  it  will  combine  the  advantages  of 
commercial  component  repositories  and  government 
supported  repositories.  This  developing  research  system  is 
an  example  of  future  software  repositories. 
JM137-MA and 40473-MA, and in part by DARPA under contract #99-F759.


2.  The  Problem 


As  the  range  and  complexity  of  computer  applications  have  grown,  the  cost  of  software 
development  has  become  the  major  expense  of  computer-based  systems  [Boehm  1981], 
[Karolak  1996].  Research  shows  that  in  private  industry  as  well  as  in  government  environments, 
schedule  and  cost  overruns  are  tragically  common  [Luqi  1989,  Jones  1994,  Boehm  1981]. 
Despite  improvements  in  tools  and  methodologies,  there  is  little  evidence  of  success  in 
improving  the  process  of  moving  from  the  concept  to  the  product,  and  little  progress  has  been 
made in managing software development projects [Hall 1997]. Research shows that 45 percent
of  all  the  causes  for  delayed  software  deliveries  are  related  to  organizational  issues 
[vanGenuchten  1991].  A  study  published  by  the  Standish  Group  reveals  that  the  number  of 
software  projects  that  fail  has  dropped  from  40%  in  1997  to  26%  in  1999.  However,  the 
percentage  of  projects  with  cost  and  schedule  overruns  rose  from  33%  in  1997  to  46%  in  1999 
[Reel  1999]. 

Despite  the  recent  improvements  introduced  in  software  processes  and  automated  tools,  risk 
assessment  for  software  projects  remains  an  unstructured  problem  dependent  on  human 
expertise  [Boehm  1988,  Hall  1997].  The  acquisition  and  development  communities,  both 
governmental  and  industrial,  lack  systematic  ways  of  identifying,  communicating  and  resolving 
technical  uncertainty  [SEI  1996]. 


This  paper  explores  ways  to  transform  risk  assessment  into  a  structured  problem  with 
systematic  solutions.  Constructing  a  model  to  assess  risk  based  on  objectively  measurable 
parameters  that  can  be  automatically  collected  and  analyzed  is  necessary.  Solving  the  risk 
assessment  problem  with  indicators  measured  in  the  early  phases  would  constitute  a  great 
benefit  to  software  engineering.  In  these  early  phases,  changes  can  be  made  with  the  least 
impact  on  the  budget  and  schedule.  The  requirements  phase  is  the  crucial  stage  to  assess  risk 
because:  a)  it  involves  a  huge  amount  of  human  interaction  and  communication  that  can  be 
misunderstood  and  can  be  a  source  of  errors;  b)  errors  introduced  at  this  phase  are  very 
expensive  to  correct  if  they  are  discovered  late;  c)  the  existence  of  software  generation  tools  can 
diminish  the  errors  in  the  development  process  if  the  requirements  are  correct;  and  d) 
requirements  evolve  introducing  changes  and  maintenance  along  the  whole  life  cycle. 

Part  of  the  problem  is  misinterpreting  the  importance  of  risk  management.  It  is  usually  and 
incorrectly  viewed  as  an  additional  activity  layered  on  the  assigned  work,  or  worse,  as  an 
outside  activity  that  is  not  part  of  the  software  process  [Hall  1997,  Karolak  1996].  One  of  the 
goals  of  our  research  is  to  integrate  a  risk  assessment  model  with  previous  research  on  CAPS2  at 
NPS  [Ham  99].  This  integration  is  required  in  order  to  capture  metrics  automatically  in  the 
context of a modern evolutionary prototyping and software development process. This should
provide  project  managers  with  a  more  complete  tool  that  can  enable  improved  risk  assessment 
without interfering with the work of a project's software engineers.


A  second  source  of  problems  in  risk  management  is  the  lack  of  tools  [Karolak  1996].  The 
main reason for this lack of tools is that risk assessment is apparently an unstructured problem.
To  systematize  unstructured  problems  it  is  necessary  to  define  structured  processes.  Structured 
processes  involve  routine  and  repetitive  problems  for  which  a  standard  solution  exists. 
Unstructured  processes  require  decision-making  based  on  a  three-phase  method  (intelligence, 
design,  choice)  [Turban  et  al  1998].  An  unstructured  problem  is  one  in  which  none  of  the  three 
phases is structured. Current approaches to risk management are highly sensitive to managers'
perceptions  and  preferences,  which  are  difficult  to  represent  by  an  algorithm.  Depending  on  the 
decision-maker’s  attitude  towards  risk,  he  or  she  can  decide  early  with  little  information,  or  can 
postpone  the  decision,  gaining  time  to  obtain  more  information,  but  losing  some  control. 

A  third  source  of  risk  management  problems  is  the  confusion  created  by  the  informal  use  of 
terms.  Often,  the  software  engineering  community  (and  most  parts  of  the  project  management 


2  CAPS  stands  for  Computer  Aided  Prototyping  System  [Luqi  1988]. 


community [Wideman 1992]) uses the term "risk" casually. This term is often used to describe
different concepts. It is erroneously used as a synonym of "uncertainty" and "threat" [SEI 1996,
Hall 1997, Karolak 1996]. Generally, software risk is viewed as a measure of the likelihood of
an unsatisfactory outcome and a loss affecting the project, process, and product [Hall 1997].

risk  Jd  pS  s 

is  a  si, oat, on  in  which  the  probabiiity  distfibS^^S 

of  s» “r.hf,eniS  -tear  ‘°  probabi“c  -“on*  of.  succession 

define  nsk  to  be  tL  * ZV *  ^  “  '“T**.  ,he  d“^  <h“*  “n  occur.  We 

probability of occurrence. This outcome could be either positive (gain) or negative (loss). This
abstraction permits one to address not only the classical risk management issue, but also to
discover opportunities leading to competitive advantage.

possWe  a^fL^b2^K'ir“S(.,he  probabili»'  distrib“ti“”  the 

in  the  process.  The  mafic!  ™ 

on a causal analysis to identify the most important threats and a statistical analysis to
determine the shape of the probability distribution and relate its parameters to readily
measurable metrics.

3.  Related  Work 

There  are  three  main  groups  of  research  related  to  risk: 

*  approach^  .™S  S™P  **"»  a  P'Obabtlistic 

Schneidewind 1975, Musa 1998]. However, this approach addresses the reliability of the
product, not the risk of failing to complete the project within budget and schedule
constraints. These approaches are not intended to assess risks related to failures of software
projects, which are the subject of the current paper. A concern with these approaches
is that the resulting assessments

‘°°  ‘f  correct  possible  faults, 

gone  a,  the  time  when  reliability  of  the  ”  m°S"y 

with  the  deretopmem process" Howeter  rtf'55  'h'  fr°m  <he  beginning-  in  Parallel 
subjective  and  weakly  structured  Basic^the^™*^6.!  ^  1655  rigorous’  ^Pically 
checklists  TSEI  1996  Hall  1007  ru  sical’>  these  approaches  use  lists  of  practices  and 

1996]  Pamdoxicar'sFI  d!^  ’  'T"  ‘"7-  J°"eS  19941  “  scoring  techniques  [Karolak 
severi  J  of “ChniCa‘  *k  “  2  "*asure  of  the  probability  and 

and  performance  requirements  fSEI  Tqq^u  ^  ^  ^  n0t  meet  ltS  mtended  functions 
in  thfs  ease  hecaS *”  ''Pr°babi,ity"  iS  mISl'ad'”S 

models  to  atlfSow  !!!Lv :  A  gr°“P  °f  reSearchers  “ses  well  known  estimation 
[Boehm  19811  and  <tr  riu  rp  3  pr°je.Cnt  cou  d  be-  The  widely  used  methods  COCOMO 

unchanged,  and  require  an  isttaattan  rfth!  ^ofth^f,  ‘T  'hH'  req“iremen,s  wH1  remain 
[Londeix  1987],  This  size  cannot  be  actnaliy  ,h'  m°d',S 

do  no^onsider'coorditnrdon^nd^orn  *"  °f  Pr0jeC,S'  i"Cl“ding  PERT’  CPM'  aad  G-«t. 
interdependencL“h  '“I  m°d'1S  rePr'Sem  SeqU'"',al 

This  simplified  vision  of  a  project  cannot  add^s  th!"!'  r  ,°"3hip’  b''7en  activities 

requirements  of  infomtation  in  concurrentnactivhfesreexrenr'  7  crea,ed  bf  reciprocal 

concurrent  activities,  exception  management,  and  the  impact  of 
actor interactions. Since the missing factors increase time requirements, the estimates resulting
from  these  generic  project  estimation  models  are  overly  optimistic. 

These  issues  are  addressed  by  Vit  Project  [Levitt  1999,  Thomsen  et  al.  1999].  Vit  Project  is 
applicable  to  projects  in  which  a)  all  activities  in  the  project  can  be  predefined;  b)  the 
organization  is  static,  and  all  activities  are  pre-assigned  to  actors  in  the  static  organization;  c)  the 
exceptions  to  activities  result  in  extra  work  volume  for  the  predefined  activities  and  are  carried 
out  by  the  pre-assigned  actors;  and  d)  actors  are  assumed  to  have  congruent  goals.  The  model  is 
well  suited  for  simulating  organizations  that  deal  with  great  amounts  of  information  processing 
and  coordination.  Such  characteristics  are  extremely  relevant  in  software  processes  [Boehm, 
1981].  However,  this  approach  requires  a  fixed  work  breakdown  structure,  and  therefore  does 
not apply at the early stages when requirements are changing and the set of tasks comprising the
project is still uncertain.

By  using  informal  risk  assessment  models,  using  estimation  models  based  on  optimistic 
assumptions  that  require  parameters  difficult  to  provide  until  late,  and  using  optimistic  project 
control  tools,  project  managers  condemn  themselves  to  overrun  schedules  and  cost. 


4.  The  Proposed  Project  Risk  Model 

Our  approach  is  based  on  metrics  automatically  collectable  from  the  engineering  database 
from  near  the  beginning  of  the  development.  The  indicators  used  are  Requirements  Volatility 
(RV),  Complexity  (CX),  and  Efficiency  (EF). 


Requirements Volatility (RV): RV is a measure of three characteristics of the requirements: a) the
Birth-Rate  (BR),  that  is  the  percentage  of  new  requirements  incorporated  in  each  cycle  of  the 
evolution  process;  b)  the  Death-Rate  (DR),  that  is  the  percentage  of  requirements  dropped  in 
each  cycle;  and  c)  the  Change-Rate  (CR)  defined  as  the  percentage  of  requirements  changed 
from  the  previous  version.  A  change  in  one  requirement  is  modeled  as  a  birth  of  a  new 
requirement  and  the  death  of  another,  so  that  CR  is  included  in  the  measured  values  of  BR  and 
DR.  RV  is  calculated  as  follows:  RV  =  BR  +  DR. 

Complexity  (CX):  Complexity  of  the  requirements  is  measured  from  a  formal  specification.  A 
requirements  representation  that  supports  computer-aided  prototyping,  such  as  PSDL  [Luqi 
1996],  is  useful  in  the  context  of  evolutionary  prototyping.  We  define  a  complexity  metric 
called  Large  Granularity  Complexity  (LGC)  that  is  calculated  as  follows:  LGC  =  O  +  D  +  T, 
where  for  PSDL  O  is  the  number  of  atomic  operators  (functions  or  state  machines),  D  is  the 
number  of  atomic  data  streams  (data  connections  between  operators),  and  T  is  the  number  of 
abstract  data  types  required  for  the  system.  Operators  and  data  streams  are  the  components  of  a 
dataflow  graph.  This  is  a  measure  of  the  complexity  of  the  prototype  architecture,  similar  in 
spirit  to  function  points  but  more  suitable  for  modeling  embedded  and  real-time  systems.  The 
measure  can  also  be  applied  to  other  modeling  notations  that  represent  modules,  data 
connections,  and  abstract  data  types  or  classes.  We  found  a  strong  correlation  between  the 
complexity  measured  in  LGC  and  the  size  of  PSDL  specifications  (correlation  coefficient  R  = 
0.996).  Most  important,  we  also  found  a  strong  correlation  (R  =  0.898)  between  the  complexity 
measured  in  LGC  and  the  size  of  the  final  product  expressed  in  non-comment  lines  of  Ada  code, 
including  both  the  code  automatically  created  by  the  generator  and  the  code  manually 
introduced  by  the  programmers. 

Efficiency  (EF):  The  efficiency  of  the  organization  is  measured  using  a  direct  observation  of  the 
use  of  time.  EF  is  calculated  as  a  ratio  between  the  time  dedicated  to  direct  labor  and  the  idle 
time:  EF  =  Direct  Labor  Time  /  Idle  Time.  We  found  that  this  easily  measurable  quantity  was  a 
good  discriminator  between  high  team  productivity  and  low  team  productivity  in  a  set  of 
simulated  software  projects  [Nogueira  2000]. 
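
The three indicators defined above can be sketched in a few lines of Python. This is only an illustration: the `CycleMetrics` record and its field names are hypothetical, and the choice to express the birth and death rates as percentages of the previous baseline size is an assumption, since the text does not state the denominator.

```python
from dataclasses import dataclass


@dataclass
class CycleMetrics:
    """Raw counts for one prototyping cycle (hypothetical field names)."""
    requirements_before: int   # size of the previous requirements baseline
    born: int                  # requirements added (a change counts as one birth)
    dropped: int               # requirements removed (a change counts as one death)
    operators: int             # O: atomic operators in the PSDL specification
    data_streams: int          # D: atomic data streams
    abstract_types: int        # T: abstract data types
    direct_labor_hours: float
    idle_hours: float


def requirements_volatility(m: CycleMetrics) -> float:
    """RV = BR + DR, both taken as percentages of the previous baseline (assumption)."""
    if m.requirements_before <= 0:
        raise ValueError("previous baseline must contain at least one requirement")
    birth_rate = 100.0 * m.born / m.requirements_before
    death_rate = 100.0 * m.dropped / m.requirements_before
    return birth_rate + death_rate


def large_granularity_complexity(m: CycleMetrics) -> int:
    """LGC = O + D + T."""
    return m.operators + m.data_streams + m.abstract_types


def efficiency(m: CycleMetrics) -> float:
    """EF = direct labor time / idle time."""
    if m.idle_hours <= 0:
        raise ValueError("idle time must be positive for the ratio to be defined")
    return m.direct_labor_hours / m.idle_hours
```

Because a changed requirement is already counted as one birth and one death, the change rate needs no separate term in `requirements_volatility`.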
We validated and calibrated our model with a series of simulated software projects using
Vit Project. This tool was chosen because of the inclusion of communications and exceptions in
its project dynamics model, and because it has been extensively validated for many types of
engineering projects, including software engineering projects. The input parameters for the
simulations were EF, RV and CX, and the observed output was the development time.
Given that the proposed model uses parameters collected during the early phases and given that
Vit Project requires a complete breakdown structure of the project, which can be done only in
the late phases, there was a considerable time gap between the two measurements. This time gap
is less than for a post-mortem analysis, but it is sufficient for model calibration and validation
purposes.


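The simulation results were analyzed statistically, with the finding that the Weibull
probability distribution was the best fit for all the samples. A random variable x is said to have a
Weibull distribution with parameters α, β and γ (with α > 0, β > 0) if the probability density
function (pdf) and cumulative distribution function (cdf) of x are respectively:

pdf:
\[
f(x; \alpha, \beta, \gamma) =
\begin{cases}
0, & x < \gamma \\
\dfrac{\alpha}{\beta^{\alpha}} \, (x - \gamma)^{\alpha - 1} \exp\left(-\left(\dfrac{x - \gamma}{\beta}\right)^{\alpha}\right), & x \ge \gamma
\end{cases}
\]

cdf:
\[
F(x; \alpha, \beta, \gamma) =
\begin{cases}
0, & x < \gamma \\
1 - \exp\left(-\left(\dfrac{x - \gamma}{\beta}\right)^{\alpha}\right), & x \ge \gamma
\end{cases}
\]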

The random variable under study, x, can be interpreted as development time in our context.
The shape parameter α controls the skew of the pdf, which is not symmetric. We found that this
is mostly related to the efficiency of the organization (EF). The scale parameter β stretches or
compresses the graph in the x direction. We found that this parameter is related to the efficiency
(EF), requirements volatility (RV), and complexity (CX) measured in LGC. The shifting
parameter γ shifts the origin of the curves to the right. We found that it is mostly related to the
complexity measured in LGC.


Based  on  best  fit  to  our  simulation  results,  the  model  parameters  can  be  derived  from  the 
project  metrics  using  the  following  algorithm: 

    if (EF > 2.0) then
        α = 1.95;
        γ = 22 * 0.32 * (13 * ln(LGC) - 82);
        β = γ / (5.71 + (RV - 20) * 0.046);
    else
        α = 2.5;
        γ = 22 * 0.85 * (13 * ln(LGC) - 82);
        β = γ / (5.47 - (RV - 20) * 0.114);
    end if;

This yields the following cumulative probability distribution for project completion:

\[
P(x) = 1 - \exp\left(-\left(\frac{x - \gamma}{\beta}\right)^{\alpha}\right), \quad \text{where } x \text{ is time in days.}
\]

This equation can be inverted to obtain the schedule length needed to have a probability P of
completing within schedule, with the following result:

\[
x = \gamma + \beta \left(-\ln(1 - P)\right)^{1/\alpha}
\]
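
As a rough illustration, the parameter algorithm and the inverted equation can be transcribed directly into Python. The function names are hypothetical and the constants are copied from the algorithm above, not independently validated. Note that the expression 13 ln(LGC) - 82 is negative for LGC below roughly 550; the text does not say how such cases are handled, so this sketch simply rejects them.

```python
import math


def weibull_parameters(ef: float, rv: float, lgc: float):
    """Return (alpha, beta, gamma) from EF, RV (percent) and LGC, per the algorithm above."""
    base = 13 * math.log(lgc) - 82
    if base <= 0:
        # Assumption: treat small specifications as outside the calibrated range.
        raise ValueError("LGC too small: the shift parameter would not be positive")
    if ef > 2.0:
        alpha = 1.95
        gamma = 22 * 0.32 * base
        beta = gamma / (5.71 + (rv - 20) * 0.046)
    else:
        alpha = 2.5
        gamma = 22 * 0.85 * base
        beta = gamma / (5.47 - (rv - 20) * 0.114)
    return alpha, beta, gamma


def completion_probability(x: float, alpha: float, beta: float, gamma: float) -> float:
    """P(x): probability of completing within x days (three-parameter Weibull cdf)."""
    if x <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((x - gamma) / beta) ** alpha))


def schedule_for_confidence(p: float, alpha: float, beta: float, gamma: float) -> float:
    """Invert the cdf: schedule length in days giving completion probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return gamma + beta * (-math.log(1.0 - p)) ** (1.0 / alpha)


# Example with made-up inputs: an efficient team, 20% volatility, LGC of 1000.
alpha, beta, gamma = weibull_parameters(ef=3.0, rv=20.0, lgc=1000.0)
days_95 = schedule_for_confidence(0.95, alpha, beta, gamma)
```

Feeding `days_95` back into `completion_probability` returns 0.95 up to rounding, which is a quick consistency check on the inversion.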

The  probability  P  can  be  interpreted  as  a  degree  of  confidence  in  the  ability  of  the  project  to 
successfully  complete  within  a  schedule  of  length  x.  Applying  the  above  equation  to  estimate 
the  development  time  needed  for  a  95%  chance  of  completion  within  schedule  for  16  different 
scenarios simulated using Vit Project, we found that the worst case was an error of 60 days.
The comparison of estimated time and simulated time is shown below.

5. Integrating Risk Assessment into Prototyping


degree  of  detail  to  which  requirements  enhancements  are  demonstrated  and  the  set  of 

requirements  issues  to  be  considered  in  the  next  prototyping  cycle,  if  any 
Risk assessment based on requirements volatility, LGC and efficiency measurements from the
steps already performed can be carried out after completion of the first version of the
prototype architecture.

done1^,:^  baa" 

measurements of similar past projects, and initial complexity estimates can be based on
subjective guesswork of the kind currently used in the macro model approaches. This kind of
estimate may be less reliable than those based solely on measurement, but it can provide a
principled and reasonably accurate basis for deciding whether or not to start a prototyping
process to determine the requirements for a proposed development project. Thus parts of our
approach can be used truly at the very beginning of the process.

Measurements of the process could be used to refine the initial estimates of the model
parameters, thus providing a balanced and systematic transition from subjective guesswork,
coded as an a priori distribution, to assessments increasingly based on systematic
measurement. Such an approach also supports incorporation and systematic refinement of
measurements from previous cycles of the iterative prototyping process.

af Jdh'oree5x*ref  Pr°V‘de  guid““  d^«  to  whtch  the  project  can 

customers  by  customers.  It  can  alio  help 

improvements,  in  the  context  of  the  resi  it'  * C*  6  ,°W  mucb  ^ey  really  want  possible 

"  — 

force to stabilize the requirements formulation process. In the absence of information on how
much potential enhancements will cost, stakeholders are prone to unrealistic requirements
amplification: of course they would always like to have a better system no matter how good the
existing one is, if you do not ask them to pay for the improvement.
basis  for i^po^rSm^S 1. cZmP?  f  SKps  Proyid'  a  "=alis,ic 

When  the  situahon  is  “*0*  'arly  ia  tba  Process, 

prototyping: the iterative process should halt when the customers have determined what
requirements they can afford to realize and which of many possible improvements they will be
willing to pay for, if any. It is not necessarily the case that the set of issues elicited by the
final round of prototype demonstration is empty. That is true only in an idealized world with
adequate budgets and patient customers.

6. Conclusion

This paper introduces a formal risk assessment model for software projects based on
probabilities and metrics automatically collectable from the project baseline. The approach
enables a project manager to assess the probability of success of the project very early in the
life cycle, during an iterative requirements process, based on well-defined
measurements rather than just guesswork or subjective judgments.

lim  SBndardH  have  CharacKrized  by  a  common 

in  this  paper  '°  'S'ima"S'  This  moM  PrK““d 

inherently  variable.  imi  &  IOn,  ^acinS  reality  that  requirements  are 

same^hi"  “**"  Pr°CeSS  b“a““  « 

cyc.  With 
software process, introducing a risk assessment step that can be automated. This step can
help shape the planning of the project in the early stages, when there
is still substantial freedom to allocate available time and budget.
Abstract

This  paper  describes  a  distributed  development  environment,  CAPS  (Computer-Aided 
Prototyping  System),  to  support  rapid  prototyping  and  automatic  generation  of  source  code 
based  on  designer  specifications  in  an  evolutionary  software  development  process.  The  CAPS 
system  uses  a  fifth-generation  prototyping  language  to  model  the  communication  structure, 
timing  constraints,  I/O  control,  and  data  buffering  that  comprise  the  requirements  for  an 
embedded  software  system.  The  language  supports  the  specification  of  hard  real-time  systems 
with  reusable  components  from  domain  specific  component  libraries.  CAPS  has  been  used 
successfully  as  a  research  tool  in  prototyping  large  real-time  control  systems  (e.g.  the 
command-and-control  station,  cruise  missile  flight  control  system,  missile  defense  systems)  and 
demonstrated  its  capability  to  support  the  development  of  large  complex  embedded  software. 


1.  Introduction 

Studies  have  shown  that  early  parts  of  the  system  development  cycle  such  as  requirements 
and design specifications are especially prone to errors [1]. Problems originating in the early
stages  often  have  a  lasting  influence  on  the  reliability,  safety  and  cost  of  the  system. 
Evolutionary  prototyping  offers  an  iterative  approach  to  requirements  engineering  to  alleviate 
the  problems  of  uncertainty,  ambiguity  and  inconsistency  inherent  in  the  process.  Moreover, 
prototyping  can  improve  the  capture  of  change  in  requirements  and  assumptions  during  the 
development  process.  This  effect  is  particularly  observed  in  projects  involving  multiple 
stakeholders  with  different  points  of  view  [4,  15]. 

Evolutionary  driven  computer  aided  software  engineering  (CASE)  tools  for  computer-aided 
prototyping  provide  logical  assessment  of  the  consistency  and  clarity  of  requirements  and 
specifications. Prototypes facilitate the requirements phase in any type of software project.
Particularly in real-time applications, where severe time constraints impose more challenges,
the use of prototypes helps to describe the requirements in a clear, precise, consistent and
executable format. Prototypes can demonstrate system scenarios to the affected parties as a way
to: a) collect criticisms and feedback for updated requirements; b) detect deviations from
users' expectations early; c) trace the evolution of the requirements; d) improve the
communication and integration of the users and the development personnel; and e) provide
early warning of mismatches between proposed software architectures and the conceptual
structure of requirements.

The benefits of prototyping are widely accepted. All modern life cycle models, such as
Boehm's spiral [2], Luqi's graph model [9], rapid application development (RAD), etc., are
based on prototyping. Experience suggests that building and integrating software by
mechanically processable formal models leads to cheaper, earlier and more reliable products
[ 3]. Bernstein estimated that for every dollar invested in prototyping, one can expect a $1.40
return within the life cycle of the system development [3]. To be effective, prototypes must be
constructed and modified rapidly, accurately, and cheaply. Software for rapid and inexpensive
construction and modification of prototypes makes this feasible [10, 11].

2.  The  Computer  Aided  Prototyping  System  (CAPS) 


The Computer-Aided Prototyping System (CAPS), a research tool developed at the Naval
Postgraduate School, is an integrated set of software tools that generate source programs
directly from high-level requirements specifications (Figure 1) [8]. CAPS provides the
following kinds of support to the prototype designer: a) timing feasibility checking via the
scheduler; b) consistency checking and automated assistance for project planning,
configuration management, scheduling, designer task assignment, and project completion date
estimation via the Evolution Control System; c) computer-aided design completion via the
editors; d) computer-aided software reuse via the software base; and e) automatic generation
of wrapper and glue code via the execution support system.


Figure 1. The CAPS rapid prototyping environment (components shown: PSDL editor, Ada
editor, GUI editor; translator, scheduler, compiler; evolution control, change merger, risk
assessment)


The  efficacy  of  CAPS  has  been  demonstrated  in  many  research  projects  (e.g.  the  command- 
and-control  station,  cruise  missile  flight  control  system,  SIDS  wireless  acoustic  monitor,  and 
missile  defense  systems)  at  the  Naval  Postgraduate  School  and  other  facilities. 

There  are  four  major  stages  in  the  CAPS  rapid  prototyping  process:  software  system  design, 
construction,  execution,  and  requirements  evaluation/modification  (Figure  2). 

The  initial  prototype  design  starts  with  an  analysis  of  the  problem  and  a  decision  about 
which  parts  of  the  proposed  system  are  to  be  prototyped.  Requirements  for  the  prototype  are 
then  generated,  either  informally  (e.g.  English)  or  in  some  formal  notation.  These  requirements 
may  be  refined  by  asking  users  to  verify  their  completeness  and  correctness. 

After  some  requirements  analysis,  the  designer  uses  the  CAPS  PSDL  editor  to  draw  dataflow 
diagrams  annotated  with  nonprocedural  control  constraints  as  part  of  the  specification  of  a 
hierarchically structured prototype, resulting in a preliminary, top-level design free from
programming level details. The user may continue to decompose any software module until its
components can be realized via reusable components from the software base or new
atomic components. This prototype is then translated into the target programming language for
execution and evaluation. Debugging and modification utilize a design database that assists the
designers in managing the design history and coordinating change, as well as other tools
shown in Figure 1.


Figure 2. Iterative prototyping process in CAPS (steps shown: generate initial requirements;
construct/modify prototype design; generate target source code; demonstrate prototype; modify
requirements. Supporting components shown: reusable software, execution support system,
DBMS, software database, design database)


3.  Application  of  CAPS  in  an  evolutionary  software  process 
3.1  CAPS  as  a  Requirements  Engineering  tool 

The requirements for a software system are expressed at different levels of abstraction and
with different degrees of formality. The highest level requirements are usually informal and
imprecise, but they are understood best by the customers. The lower levels are more technical,
precise, and better suited for the needs of the system analysts and designers, but they are
further removed from the users' experiences and less well understood by the customers.
Because of the differences in the kinds of descriptions needed by the customers and
developers, it is not likely that any single representation for requirements can be the "best"
one for supporting the entire software development process. CAPS provides the necessary means
to bridge the communication gap between the customers and developers. The CAPS tools are
based on the Prototype System Description Language (PSDL), which is designed specifically for
specifying hard real-time systems [6, 7]. It has a rich set of timing specification features and
offers a common baseline from which users and software engineers describe requirements. The
PSDL descriptions of the prototype produced by the PSDL editor are very formal, precise and
unambiguous, meeting the needs of the system analysts and designers. The demonstrated
behavior of the executable prototype, on the other hand, provides concrete information for the
customer to assess the validity of the high level requirements and to refine them if necessary.


3.2  CAPS  as  a  System  Testing  and  Integration  tool 

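The process supported by CAPS provides requirements and designs in a form that can be used
in construction of the operational system. The prototype provides an executable representation
of system requirements that can be used for comparison during system testing. The existence of
a flexible prototype can significantly ease system testing and integration. When final
implementations of subsystems are delivered, integration and testing can begin before all of
the subsystems are complete by combining the final versions of the completed subsystems with
prototype versions of the parts that are still being developed.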

3.3  CAPS  as  an  Acquisition  tool 

reai-,ime  «“»*  -  ** b— 

ch:;du“rem'< ierlopers' !s 

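Acquisition managers can use CAPS to ensure that acquisition efforts stay on track and that
contractors deliver what they promise. CAPS enables validation of requirements via
prototyping demonstration, greatly reducing the risk of contracting for real-time systems.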

3.4  CAPS  as  a  Risk  Assessment  tool 

„  Ju' u“  °f  P™0^'5  '"“duces  a  problem  for  project  planning  because  of  the  uncertain 
complexity  ZThouTd  bet  “T*  «“*"“*  «“  P°od“  °"d  *~Tof 

2l?IS-=SS~SS 

obtained  bv  the*  qpop  t  •  program  integration  step  transforms  the  modules 

obtained  by  the  generator  are  integrated  into  a  program,  possibly  addin*  code  created  bv 

The8narnmerS  ana  rc“Sable  comP°"C"<s.  This  step  includes  integration  testing  and  debugging 

I  h ^er' tnltStt TT"  “  3  pro‘MyPa'  -  ~o  po’ssiW. SSSj 
and  expectate  oflf '  ,  '"“duces  cnttcisms,  or  b)  the  product  matches  the  needs 
Critickme  S  cusTer-  ln  the  flret  case,  the  process  continues  by  analyzing  the 

■he  »  aph  t  Z  ”  T  IT  SKP  ,,,a,  pr°d““s  new  issuts  "losing Vernal  cyde  in 

~  Tai^rr/rf  durin/  a  produc'  ™p'»-»-V  s 

graph  “  dUr'"S  2  Pr0dllc'  demo  S,'P  c'os*"8  the  internal  cycle  of  the 

We improved the evolutionary prototyping process by introducing a new vertex in the graph,
the risk assessment step (Figure 4) [14]. A risk assessment step can be done automatically
after the completion of the specifications. CAPS provides the automation needed to derive the
complexity of the product from the PSDL specifications. This derivation will be used together
with personnel and organizational information, and with metrics of requirements collected
from the baselines, to produce the risk assessment. The requirements analysis step integrates
these measures with issues in the issue analysis steps.


Figure  3.  A  typical  evolutionary  prototyping  software  process. 


Figure  4.  The  improved  evolutionary  prototyping  software  process. 


4.  A  simple  example:  prototyping  a  C3I  workstation 


'"cAPs^f? ini,ial  rooTZ  <  “h  • 

EB5^S“=SS~ 

allows  the  user  ^o^soedf^Tf313  'S  dama§e*  serv'ce_required,  or  out_of_ammunition.  CAPS 

operator 

To build the c3i_system prototype, the user must specify how each module will be implemented
by choosing the implementation language for the module via the operator property menu. The
implementation of a module can be in either the target programming language or PSDL. A
module with an implementation in the target programming language is called an atomic
operator. A module that is decomposed into a PSDL implementation is called a composite
operator. Module decomposition can be done by selecting the corresponding operator in the
tree-panel on the left side of the PSDL editor.
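
The distinction between atomic and composite operators can be pictured as a tree in which leaves are atomic and inner nodes are composite. The sketch below is a hypothetical data structure for illustration only, not the CAPS design database; the sub-module names are invented.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Operator:
    """A PSDL-style module: atomic if it has no decomposition, composite otherwise."""
    name: str
    children: List["Operator"] = field(default_factory=list)

    @property
    def is_atomic(self) -> bool:
        return not self.children


def count_operators(root: Operator) -> Tuple[int, int]:
    """Return (composite_count, atomic_count) for the hierarchy under root."""
    if root.is_atomic:
        return (0, 1)
    composite, atomic = 1, 0
    for child in root.children:
        c, a = count_operators(child)
        composite += c
        atomic += a
    return (composite, atomic)


# Hypothetical decomposition: only comms_interface is refined further.
prototype = Operator("c3i_system", [
    Operator("comms_interface", [
        Operator("message_decoder"),   # invented sub-module name
        Operator("message_router"),    # invented sub-module name
    ]),
    Operator("user_interface"),
])
composites, atomics = count_operators(prototype)
```

Counting the atomic operators in this way gives the O term of the LGC metric for a hierarchical design.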

The user may choose to implement all nine modules as atomic operators (using dummy
components) in the first version, so as to check out the global effects of the timing and control
constraints. Then, he/she may choose to decompose the comms_interface module into more
detailed subsystems and implement the sub-modules with reusable components, while leaving
the others as atomic operators in the second version of the prototype, and so on.

To facilitate testing of the prototypes, CAPS provides the user with an execution support
system that consists of a translator, a scheduler and a compiler. Once the user finishes
specifying the prototype, he/she can invoke the translator and the scheduler from the CAPS
main interface to analyze the timing constraints for feasibility and to generate a supervisor
module for each subsystem of the prototype in the target programming language. Each
supervisor module consists of a set of driver procedures that realize all the control
constraints, a high priority task (the static schedule) that executes the time-critical operators
in a timely fashion, and a low priority dynamic schedule task that executes the
non-time-critical operators when there is time available. The supervisor module also contains
information that enables the compiler to incorporate all the software components required to
implement the atomic operators and generate the binary code automatically. The
translator/scheduler also generates the glue code needed for timely delivery of information
between subsystems across the target network.
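
The division of labor between the static schedule and the dynamic schedule can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the code CAPS generates: operators are reduced to (name, execution time) pairs, time is simulated rather than measured, and the function name `run_frame` is hypothetical.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Op = Tuple[str, int]  # (operator name, execution time in milliseconds)


def run_frame(static_schedule: Sequence[Op],
              dynamic_queue: Deque[Op],
              frame_ms: int) -> List[str]:
    """Simulate one scheduling frame and return operator names in execution order.

    Time-critical operators run first, in their fixed order. Non-time-critical
    operators are then taken from the queue for as long as the next one still
    fits in the remaining time; the rest wait for a later frame.
    """
    static_time = sum(cost for _, cost in static_schedule)
    if static_time > frame_ms:
        raise ValueError("static schedule is infeasible for this frame length")

    executed = [name for name, _ in static_schedule]
    elapsed = static_time

    while dynamic_queue and elapsed + dynamic_queue[0][1] <= frame_ms:
        name, cost = dynamic_queue.popleft()
        elapsed += cost
        executed.append(name)
    return executed


# Example with invented operators and timings.
pending = deque([("log_status", 40), ("refresh_display", 20)])
order = run_frame([("read_sensor", 30), ("update_track", 20)], pending, frame_ms=100)
```

The feasibility check before execution mirrors the role of the scheduler's timing analysis: an infeasible static schedule is reported up front rather than discovered at run time.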
OPERATOR c3i_system
SPECIFICATION
  DESCRIPTION
    {This module implements a simplified version of
    a generic C3I workstation.}
END
IMPLEMENTATION
  GRAPH
  DATA STREAM

Figure 7. Top-level specification of the c3i_system

For prototypes that require sophisticated graphic user interfaces, the CAPS main interface
provides an interface editor to interactively sculpt that interface. In the final version of the
c3i_system prototype, we chose to decompose the comms_interface, the track_database_manager
and user_interface modules into subsystems, resulting in a hierarchical design consisting of
8 composite operators and 26 atomic operators. The user interface of the prototype has a total
of 14 panels, four of which are shown in Figure 8. The corresponding Ada program has a total
of 10.5K lines of source code. Among the 10.5K lines of code, 3.5K lines come from the
supervisor module that was generated automatically by the translator/scheduler and 1.7K lines
were automatically generated by the interface editor.

Figure 8. User interface of the c3i_system


To  evaluate  the  benefits  derived  from  the  practice  of  computer-aided  prototyping  within  the 
software  acquisition  process,  we  conducted  a  case  study  in  which  we  compared  the  cost  (in 
dollar  amounts)  required  to  perform  requirements  analysis  and  feasibility  study  for  the  c3i 
system  using  the  Mil-Std  2167A  process,  in  which  the  software  is  coded  manually,  and  the 
rapid  prototyping  process,  where  part  of  the  code  is  automatically  generated  via  CAPS  [5]. 

We  found  that,  even  under  very  conservative  assumptions,  using  the  CAPS  method  resulted 
in  a  cost  reduction  of  $56,300,  a  27%  cost  saving.  Taking  the  results  of  this  comparison,  then 
projecting  to  a  mission  control  software  system,  the  command  and  control  segment  (CCS),  we 
estimated  that  there  would  be  a  cost  saving  of  12  million  dollars.  Applying  this  concept  to  an 
engineering  change  to  a  typical  component  of  the  CCS  software  showed  a  further  cost  savings 
of  $25,000. 

5.  Conclusion 


CAPS  has  been  used  successfully  as  a  research  tool  in  prototyping  large  war-fighter  control 
systems  and  demonstrated  its  capability  to  support  the  development  of  large  complex  embedded 
software.  Specific  payoffs  include: 

(1)  Formulate/validate  requirements  via  prototype  demonstration  and  user  feedback, 

(2)  Assess  feasibility  of  real-time  system  designs, 

(3)  Enable  early  testing  and  integration  of  completed  subsystems, 

(4)  Support  evolutionary  system  development,  integration  and  testing, 

(5)  Reduce  maintenance  costs  through  systematic  code  generation, 

(6)  Produce  high  quality,  reliable  and  flexible  software, 

(7)  Avoid  schedule  overruns. 
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The  Use  of  Computer  Aided  Prototyping  for 
Re-engineering  Legacy  Software 

Luqi,  V.  Berzins,  M.  Shing,  M.  Saluto,  J.  Williams 
J.  Guo  and  B.  Shultes 

Abstract 

Re-engineering  is  typically  needed  when  a  system  performing  a  valuable  service  must 
change, and its current implementation can no longer support cost-effective changes. The
process of re-engineering old procedural software to a modern object-oriented
architecture  introduces  certain  complexities  into  the  software  analysis  process.  The  direct 
products  of  reverse  engineering,  such  as  requirements  or  design  specifications,  are  likely 
to  have  a  functionally  based  structure.  As  a  result,  some  transformation  of  the  recovered 
requirements and design specifications is necessary in order to obtain specifications for
the  new  structures.  It  is  often  very  difficult  to  quickly  determine  if  the  transformed 
specification  is  a  true  representation  of  the  desired  requirements.  This  paper  discusses  the 
effective  use  of  computer-aided  prototyping  techniques  for  re-engineering  legacy 
software,  and  presents  results  of  a  case  study  which  showed  that  prototyping  can  be  a 
valuable  aid  in  re-engineering  of  legacy  systems,  particularly  in  cases  where  radical 
changes  to  system  conceptualization  and  software  structure  are  needed.  The  CAPS 
system  enabled  us  to  do  this  with  a  minimal  amount  of  coding  effort. 


1.  INTRODUCTION 

[...] requirements, design decisions, invaluable advice and suggestions from
[...] users that have been implemented over the years. To effectively use these assets,
[...] to employ a [...] continued evolution of the current
system to meet the ever-changing mission, technology and user needs. Re-engineering
has frequently been proven to be more cost effective than new development and is also
known to better promote continuous software evolution.

However, the institutional knowledge implicit in a legacy system is difficult to
recover after many years of operation, evolution, and personnel change. These software
[...] modular structure, and coherent abstractions that correspond to current or projected
requirements. Past optimizations and design changes have spread design decisions that
now must be changed over large areas of the code, and may have introduced
inconsistencies and faults.

Software re-engineering can be defined as the systematic transformation of an
existing system into a new form to realize quality improvements, such as increased or
enhanced functionality, better maintainability, configurability, reusability, performance
or evolvability at a reduced cost, schedule, or risk to the customer. This process involves
recovering existing software artifacts from the system and then transforming and
re-[...]. [...] were originally designed and implemented using a functionally based approach,
some transformation of the recovered information is necessary in order to obtain an
object-oriented model. It is often very difficult to obtain a transformed specification that
accurately represents the desired requirements.

Since legacy systems are usually re-engineered only when the existing systems need
some kind of improvement, it is unlikely that the initial version of the reconstructed
requirements adequately reflects current user needs. Prototyping provides a means to
identify and validate changes to system requirements while simultaneously enabling
prospective users to get a feel for new aspects of the proposed system. It is a
well-established approach that can be highly effective in increasing software quality [15].
When used in conjunction with conducting a major re-engineering effort, prototyping can
be extremely useful in assisting in many areas of software modification, validation, risk
reduction, and the refinement of new software architectures and user requirements.

This paper describes a case study that illustrates the effective use of computer-aided
prototyping techniques for re-engineering legacy software [...]. The case study
consists of developing an object-oriented modular architecture for the existing US Army
Janus(A) combat simulation system [19], and validating the architecture via an
executable prototype using the Computer Aided Prototyping System (CAPS), a research
tool developed at the Naval Postgraduate School [14]. Janus(A) is a software-based war
game that simulates ground battles between up to six adversaries [9]. It is an interactive,
closed, stochastic, ground combat simulation with color graphics. Janus is "interactive" in
that command and control functions are entered by military analysts who decide what to
do in crucial situations during simulated combat. The current version of Janus operates on
a Hewlett Packard workstation and consists of over 350,000 lines of FORTRAN code.
The FORTRAN modules are organized as a flat structure and interconnected with one
another via [...] FORTRAN COMMON blocks, resulting in a software structure that
makes modification to Janus very costly and error-prone. The Software Engineering
group at the Naval Postgraduate School was tasked to extract the existing functionality
through reverse engineering and to create a baseline object-oriented architecture that
supports existing and required enhancements to Janus functionality.

The paper presents the re-architecturing process and the resultant object-oriented
[...] prototyping to validate the resultant architecture, and Section 5 draws some conclusions.

2.  REVERSE  ENGINEERING 


The re-architecturing process used in the case study consists of 3 major phases: reverse
engineering, object-oriented design and design validation via prototyping (Figure 1).


[Figure 1: flow diagram with three phases: Reverse Engineering, Object-oriented Design, and
Design Validation via Prototyping. Legible labels: source code, design documents, user manual,
domain experts; dataflow diagrams, structure charts; domain expert feedback; executable
prototype; target OO system implementation.]

Figure 1. The object-oriented re-architecturing process.
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The  first  phase  is  reverse  engineering.  Input  to  this  phase  includes  the  legacy  source 
code, design documents, user manuals, and information from domain experts. Since the
goal  of  the  initial  re-engineering  effort  is  to  duplicate  the  functionality  of  the  existing 
system  within  a  modular,  extensible  architecture  and  to  reuse  domain  concepts,  models 
and  algorithms  instead  of  the  existing  code,  we  should  avoid  including  any 
requirements/constraints  that  are  consequences  of  issues  related  to  FORTRAN 
implementation.  The  best  places  to  extract  domain  concepts  from  the  existing  system  are 
the  user  manuals  and  the  database  management  system  manuals.  These  manuals  were 
written  using  the  lingo  of  the  user  community  and  should  be  relatively  free  of 
implementation  details.  We  found  the  JANUS  Data  Base  Management  Program  Manual 
[10]  particularly  useful  because  it  contains  detailed  information  on  what  kind  of  data  are 
needed  to  model  the  battlefield  and  how  they  are  organized  (logically)  in  the  database. 
The  top-level  structure  of  the  database  is  shown  in  Figure  2. 

Not  shown  in  Figure  2  are  the  interdependencies  between  the  data,  whereby  data 
entered  in  one  category  affect  directly  or  indirectly  the  data  in  other  categories.  For 
example,  the  barrier  delay  attributes  of  the  Engineer  Data  depend  on  specific  weather 
conditions  derived  from  the  Weather  Data  and  system  functional  characteristics  derived 
from  the  System  Data.  The  overall  network  of  interdependencies  is  highly  complex  and 

can  only  be  understood  through  construction  and  analysis  of  a  functional  model  of  the 
existing  Janus  software. 
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Figure  2.  The  top-level  structure  of  the  Janus  Database. 
Analysis of the legacy implementation is a daunting but inescapable part of this step.
If printed out at 60 lines per page, 350,000 lines would fill almost 6000 pages. We
recoiled from the magnitude of this effort and analyzed the Janus User's Manual [9], the
Software Programmer's Manual [7], the Janus Software Design Manual [8], and the Janus
Algorithm Document [18] instead. These documents helped us get started because they
contained higher level information and were much shorter than the code. However, they
were also older, and it was a constant struggle to determine which parts were still
accurate, and which were not. In hindsight, avoiding analysis of the code was a mistake
that slipped the schedule of the project by several months. Understanding a design of this
[...]. We should have started analysis of the code right away and should have
persistently continued this task in parallel with all other re-engineering activities.
Cross-fertilization between all the tasks would have helped us recognize some dead-end
directions earlier and would have enabled us to spend meeting time more effectively.

Using manual techniques augmented with simple UNIX shell commands, we were
able to walk through the code and get a fairly good idea of what each subroutine was
designed to do. We also used the Software Programmers' Manual [7] to aid in
understanding each subroutine's function. In doing so we were able to group the
subroutines by functionality to get a better understanding of the major data flows between
programs and develop functional models from the data flows. We used CAPS to assist in
developing the abstract models. CAPS allowed us to rapidly graph the gathered data and
transform it into a more readable and usable format. Additionally, CAPS enabled us to
concurrently develop our diagrams, and then join them together under the CAPS
environment, where they can be used to generate an executable model.

We also had a series of brief meetings with the client, TRAC-Monterey, asking
questions and making notes on the system's operation and its current functionality. We
paid attention to the client's view of the system to gather their ideas on its strengths,
weaknesses, and desired and undesired functionality. These meetings were indispensable
because they gave us information that was not present in the code. Since we were not
familiar with the domain of ground combat simulation, we were using these meetings to
determine the requirements of this domain, often playing the role of "smart ignoramuses"
[4]. Domain analysis has been identified as an effective technique for software
re-engineering [17]. Our experience suggests that competent engineers unfamiliar with the
application domain have an essential role in re-engineering as well as in requirements
elicitation because lack of inessential information about the application domain makes it
easier to find new, simpler design structures and architectural concepts to guide the
re-engineering effort.

3.  OBJECT-ORIENTED  DESIGN 

Next,  we  developed  object  models  and  architecture  of  the  Janus  System  using  the 
aforementioned  materials  and  products,  to  create  the  modules  and  associations  amongst 
them.  Information  modeling  is  needed  to  support  effective  re-engineering  of  complex 
systems [5]. This was probably the most difficult and most important phase. It required a
great  deal  of  analysis  and  focus  to  transform  the  currently  scattered  sets  of  data  and 
functions  into  small,  coherent  and  realizable  objects,  each  with  its  own  attributes  and 
operations.  In  performing  this  phase,  we  used  our  knowledge  of  object-oriented  analysis 
and  applied  the  OMT  techniques  [20]  and  the  UML  notations  to  create  the  classes  and 
associated attributes and operations [21]. This was a crucial phase because we had to

ensure  that  the  classes  we  created  accurately  represented  the  functions  and  procedures 
currently  in  the  software. 

Restructuring  software  to  identify  data  abstractions  is  a  difficult  part  of  the  process. 
Transformations  for  meaning-preserving  restructuring  can  be  useful  if  tool  support  is 
available [6]. We used the HP-UNIX systems at the TRAC-Monterey facility to run the
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Janus  simulation  software  to  aid  in  verifying  and  supplementing  the  information  we 
obtained  from  reviewing  the  source  code  and  documentation.  This  step  enabled  us  to 
better  analyze  the  simulation  system,  gaining  insight  into  its  functionality  and  further 
concentrate  on  module  definition  and  refinement. 

The  re-engineering  team  met  several  times  each  week  for  a  period  of  two  and  a  half 
months  to  discuss  the  object  models  for  the  Janus  core  data  elements  and  the  object- 
oriented  architecture  for  the  Janus  System.  We  presented  the  findings  to  the  Janus  domain 
experts  at  least  once  per  week  to  get  feedback  on  the  models  and  architectures  being 
constructed.  In  addition,  the  re-engineering  team  also  presented  the  findings  to  members 
of  the  OneSAF  project,  the  Combat21  project,  and  the  National  Simulation  Center 
project.  We  found  that  information  from  these  domain  experts  was  essential  for 
understanding  the  system,  particularly  in  cases  where  the  legacy  code  did  not  correspond 
to  stakeholder  needs.  This  supports  the  hypothesis  advanced  in  [11]  that  the  involvement 
of  domain  experts  is  critical  for  nontrivial  re-engineering  tasks. 

Early  involvement  of  the  stakeholders  in  the  simulation  community  also  paid  off  in 
the  long  run.  Both  the  National  Simulation  Center  and  Combat21  projects  were  able  to 
save  time  and  money  by  reusing  our  work  and  came  up  with  designs  that  look  remarkably 
like ours (although much larger). Now, OneSAF developers have been directed to look at
the  Combat21  class  design  and  reuse  as  much  as  possible.  So,  our  efforts  have  directly 
benefited  other  simulation  developers. 

Based  on  the  feedback  from  the  domain  experts,  the  re-engineering  team  revised  the 
object  models  for  the  Janus  core  elements  and  developed  a  3-tier  object-oriented 
architecture  for  the  Janus  System  (Figure  3). 
Figure 3. The proposed 3-tier object-oriented architecture.


We  extracted  most  of  the  data  and  operations  from  the  existing  Combat  System 
DBMS,  Scenario  Management,  Janus  Combat  Simulation,  JAAWS  and  POSTP 
subsystems  and  encapsulated  them  as  simulation  objects  in  the  Core  Elements  package, 
leaving  only  application  specific  control  codes  that  use  the  simulation  objects  in  each  of 
these five subsystems. Figures 4 and 5 show the top level class structures of the object
models of the core elements. Details of the associated attributes and operations can be
found in [2, 23] and are omitted from these diagrams due to space limitations.


Central  to  the  Janus  Combat  Simulation  Subsystem  is  the  program  RUNJAN,  which 
is  the  main  event  scheduler  for  the  simulation.  RUNJAN  determines  the  next  scheduled 
event  and  executes  that  event.  If  the  next  scheduled  event  is  a  simulation  event,  RUNJAN 
will  advance  the  game  clock  to  the  scheduled  time  of  the  event  and  perform  that  event. 
The  existing  Janus  Simulation  System  uses  17  different  categories  to  characterize  the 
events.  RUNJAN  then  handles  these  17  events  using  the  following  event  handlers: 

1) DOPLAN - Interactive Command and Control activities

2)  MOVEMENT  -  Update  unit  positions 

3)  DOCLOUD  -  Create  and  update  smoke  and  dust  clouds 

4)  STATEWT  -  Periodic  activity  to  write  unit  status  to  disk 

5)  RELOAD  -  Plan  and  execute  the  direct  fire  events 

6)  INTACT  -  Update  the  graphics  displays 

7)  CNTRBAT  -  Detect  artillery  fire 

8)  SEARCH  -  Update  target  acquisitions,  choose  weapons  against  potential  targets, 
and  schedule  potential  direct  fire  events 

9)  DOCHEM  -  Create  chemical  clouds  and  transition  units  to  different  chemical  states 

10)  FIRING  -  Evaluate  direct  fire  round  impacting  and  execute  indirect  fire  missions 

11)  IMPACT  -  Evaluate  and  update  the  results  of  an  indirect  round  impacting 

12)  RADAR  -  Update  an  air  defense  radar  state  and  schedule  direct  fire  events  for 
“normal”  radar 

13)  COPTER  -  Update  helicopter  states 
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14)  DOARTY  -  Schedule  indirect  fire  missions 

15)  DOHEAT  -  Update  unit’s  heat  status 

16) DOCKPT - Activity to record automatic checkpoints

17) ENDJAN - Housekeeping activity to end the simulation

The  existing  event  scheduler  uses  global  arrays  and  matrices  to  maintain  the  attributes 
of  the  objects  in  the  simulation.  Hence,  one  of  the  major  tasks  in  designing  an  object- 
oriented  architecture  for  the  Janus  Combat  Simulation  Subsystem  was  to  distribute  the 
event  handling  functions  to  individual  objects.  However,  many  of  the  current  event 
handler  categories  contained  redundant  code.  They  did  not  seem  to  be  independent  of 
each  other  and  were  not  consistent  with  the  class  hierarchy  we  created.  For  example,  the 
set  of  event  handlers  used  to  simulate  the  activities  of  a  particular  unit  to  search  for 
targets,  select  weapons,  prepare  for  a  direct  fire  engagement,  and  then  execute  that  direct 
fire  engagement  differs  depending  upon  whether  the  unit  has  a  normal  radar,  special 
radar,  or  no  radar  at  all.  The  existing  Janus  Simulation  System  uses  the  RADAR  event 
handler  to  carry  out  the  entire  procedure  if  the  unit  has  normal  radar.  However,  it  uses 
the  SEARCH,  RADAR,  and  RELOAD  event  handlers  to  carry  out  the  procedure  if  the 
unit  has  special  radar.  Finally  the  system  uses  the  SEARCH  and  RELOAD  event 
handlers  to  conduct  the  procedure  if  the  unit  has  no  radar  at  all.  We  conjecture  that  this 
lack  of  uniformity  is  due  to  a  series  of  software  modifications  made  by  different  people  at 
different  times  without  full  knowledge  of  the  software  structure.  The  example  also 
illustrates  another  problem:  the  legacy  event  handlers  were  not  designed  to  perform 
independent  tasks,  and  had  complicated  interactions  with  each  other. 
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It  was  necessary  to  redefine  some  event  categories  in  order  to  reduce 
interdependencies  between  the  event  handlers,  to  factor  simulation  behavior  into  more 
coherent  modules,  to  eliminate  redundant  coding  of  the  same  or  similar  functions  and  to 
take  advantage  of  dynamic  dispatching  of  event  handling  functions  in  the  object-oriented 
architecture.  Moreover,  the  Janus  system  was  originally  designed  to  work  in  isolation, 
and  has  since  been  adapted  to  interact  with  other  simulation  systems.  Interactions 
between  the  simulation  engine  and  the  world  modeler  (the  interface  to  the  distributed 
simulation  network)  are  performed  implicitly  within  the  various  event  handlers  in  the 
existing  Janus.  Such  interactions  are  made  explicit  in  the  new  architecture  in  order  to 
provide  a  uniform  framework  to  update  World  Model  objects  during  the  simulation. 

The  new  architecture  uses  an  explicit  priority  queue  of  event  objects  to  schedule  the 
simulation  events.  We  were  able  to  reduce  the  total  number  of  event  handlers  needed  in 
the  simulation,  from  17  to  14,  by  eliminating  identified  redundant  code  (Figure  6).  The 
14  remaining  event  handlers  are  as  follows: 

1) DOPLAN - Interactive Command and Control activities

2) MOVE_UPDATE_OBJ - Move and update the objects in the simulation

3) SEARCH - Search for potential targets based on the detection devices available to
the objects

4) CHOOSE_DIRECT_FIRE_TARGETS - Once search is complete, choose best
target to engage. In future simulations, implementations may allow users to choose
targets

5) COUNTERBATTERY - Simulate counter battery radar to detect artillery fire

6) DO_DIRECT_FIRE - Execute direct fire events and update ammunition status

7) DO_INDIRECT_FIRE - Execute indirect fire events and update ammunition status

8) IMPACT_EFFECTS - Calculate results of round impacting

9) UPDATE_HEAT_STATUS - Update unit's heat status

10) UPDATE_CHEMICAL_STATUS - Update unit's chemical status

11) DISPLAY - Update the graphics display

12) WRITE_STATUS - Periodic activity to write unit status to disk

13) CHECK_POINT - Activity to record automatic checkpoints

14) END_SIMULATION - Activity to end the simulation

We  tried  to  make  the  actions  of  the  new  event  handlers  independent  and  orthogonal. 
Independent  means  that  one  event  handler  does  not  invoke  or  depend  on  the  action  of 
another.  Orthogonal  means  that  the  purpose  of  one  event  handler  is  completely  separate 
from  that  of  another.  Although  our  architecture  does  not  completely  meet  these  goals,  it 
comes  much  closer  to  them  than  the  legacy  design  does.  We  believe  that  these  properties 
of  the  architecture  are  desirable  because  they  impose  a  partitioned  structure  on  the  system 
that  aids  future  enhancements  and  modifications.  If  an  enhancement  affects  only  one  kind 
of  event,  then  it  becomes  relatively  easy  to  isolate  the  affected  part  of  the  code.  If  suitable 
naming  conventions  are  followed,  relatively  low-tech  tool  support  will  be  adequate  for 
helping  system  maintainers  find  the  parts  of  the  code  that  must  be  understood  and 
modified  to  make  a  future  change  to  the  system. 


Figure  6.  The  event  class  hierarchy. 


Every  event  has  an  associated  simulation  object  in  the  new  architecture.  This 
associated  object  is  the  target  of  the  event.  Depending  on  the  subclass  to  which  an  event 
object belongs, the “execute” method of the event will invoke the corresponding event
handler  of  the  associated  simulation  object  (Figure  7).  The  simulation  object  superclass 
defines  the  interface  of  the  event  handlers  for  the  event  groups.  At  the  highest  level,  it 
provides  an  empty  body  as  the  default  implementation  for  the  event  handlers.  Events  are 
dispatched  to  the  appropriate  subclass.  If  there  is  something  more  specific  that  needs  to 
be  done  for  instances  of  the  subclass,  the  event  handler  of  the  subclass  overrides  the 
inherited  method  in  order  to  simulate  the  desired  behavior. 
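
The dispatching scheme described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the code of the prototype. The class names (SimulationObject, Tank, MoveEvent, EndSimulationEvent), the handler signatures, and the log attribute are assumptions introduced here; only the overall structure (event subclasses whose execute method calls a handler on the associated object, with empty default handlers in the superclass) comes from the text.

```python
class SimulationObject:
    """Superclass: declares one handler per event group, each with an empty body."""

    def __init__(self, name):
        self.name = name
        self.log = []  # hypothetical: records the handlers that actually did work

    def move_update_obj(self, time):
        pass  # default implementation: nothing to do

    def end_simulation(self, time):
        pass  # default implementation: nothing to do


class Tank(SimulationObject):
    """Subclass that overrides only the handler it needs to specialize."""

    def move_update_obj(self, time):
        self.log.append(("move", time))


class Event:
    """An event carries its scheduled time and the simulation object it targets."""

    def __init__(self, time, target):
        self.time = time
        self.target = target

    def execute(self):
        raise NotImplementedError


class MoveEvent(Event):
    def execute(self):
        # The event subclass selects the handler; the object's class selects the body.
        self.target.move_update_obj(self.time)


class EndSimulationEvent(Event):
    def execute(self):
        self.target.end_simulation(self.time)
```

In this sketch a Tank reacts to a MoveEvent through its overriding method, while an EndSimulationEvent falls through to the inherited empty handler.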
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Figure  7.  The  simulation  object  class  hierarchy. 


The  above  architecture  enables  a  very  simple  realization  of  the  main  simulation  loop: 
    initialization;
    while not empty(event_queue) loop
        e := remove_event(event_queue);
        e.execute();
    end loop;
    finalization;

Note  that  this  same  code  is  used  to  handle  all  of  the  event  handlers,  including  those 
for  future  extensions  that  have  not  yet  been  designed.  Event  objects  with  associated 
simulation  objects  are  created  and  inserted  into  the  event  queue  by  the  initialization 
procedure,  the  constructors  of  simulation  objects,  and  the  actions  of  other  event  handlers. 

Depending  on  the  actual  event,  events  are  inserted  into  an  event  priority  queue  based  on 
time  and  priority. 
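
As an illustration of the loop and the event priority queue, the following Python sketch orders events by time and then by priority. It is a sketch under stated assumptions, not the Janus implementation: the EventQueue and TraceEvent classes are hypothetical, and the convention that a smaller priority value runs first is an arbitrary choice made here, since the text does not specify one.

```python
import heapq
import itertools


class EventQueue:
    """Priority queue ordered by (time, priority); a counter breaks remaining ties."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def insert(self, time, priority, event):
        # The counter guarantees that event objects themselves are never compared.
        heapq.heappush(self._heap, (time, priority, next(self._counter), event))

    def empty(self):
        return not self._heap

    def remove_event(self):
        return heapq.heappop(self._heap)[-1]


class TraceEvent:
    """Hypothetical event that only records its label when executed."""

    def __init__(self, label, trace):
        self.label = label
        self.trace = trace

    def execute(self):
        self.trace.append(self.label)


def run(queue):
    # Mirrors the loop in the text: no knowledge of concrete event types is needed.
    while not queue.empty():
        event = queue.remove_event()
        event.execute()
```

Because run only calls execute, any event type added later can be scheduled without touching the loop.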

Our  newly  designed  architecture  eliminates  the  need  for  the  simulation  loop  to  know 
what  kind  of  object  it  is  handling.  Thus  when  adding  an  object  type  not  yet  designed,  the 
simulation  loop  does  not  require  additional  code  to  invoke  the  new  object’s  event 
handlers.  By  localizing  all  changes  to  the  newly  added  object  class,  our  architecture 
eliminates  the  possibility  of  introducing  errors  into  the  existing  parts  of  the  simulation. 

4.  DESIGN  VALIDATION  VIA  PROTOTYPING 

The  process  of  transforming  a  design  developed  using  the  functional  approach  into  an 
object-oriented  design  introduces  risks  of  unintentionally  altering  system  behavior.  In  the 
context  of  our  case  study,  the  resultant  object  oriented  architecture  and  the  new  event 
dispatching  control  structure  are  areas  of  high  risk  since  they  differ  significantly  from  the 


functional  design  of  the  legacy  software.  UML  provides  two  ways  to  model  behavior. 
One  is  to  capture  the  behavior  of  individual  objects  over  time  using  state  machines,  and 
the  other  is  to  capture  the  interactions  of  a  set  of  objects  in  the  system  using  sequence 
diagrams and collaboration diagrams. While state machines are precise, they focus on only
a single object at a time, which makes it hard to understand the behavior of the system as a
whole.  The  sequence  diagrams  and  the  collaboration  diagrams,  on  the  other  hand,  lack  a 
formal  semantics  for  precise  description  of  the  system  behaviors. 

One  way  to  reduce  the  risk  is  to  validate  the  dynamic  behavior  of  the  proposed 
architecture  and  to  refine  the  interfaces  of  subsystems  via  prototyping  at  the  early  design 
stage.  To  be  effective,  prototypes  must  be  constructed  and  modified  rapidly,  accurately, 
and  cheaply.  Computer  aid  for  constructing  and  modifying  prototypes  makes  this  feasible 
[15]. The CAPS system is an integrated set of software tools that generate source
programs  directly  from  high-level  requirement  specifications. 

Due  to  time  and  resource  limitations,  we  developed  a  prototype  for  only  a  very  small 
simulation  run,  which  consists  of  a  single  object  (a  tank)  moving  on  a  two-dimensional 
plane, three event subclasses (move, do_plan, and end_simulation), and one kind of
post-processing statistics (fuel consumption).

We  developed  an  executable  prototype  using  CAPS.  Figure  8  shows  the  top-level 
structure of the prototype, which has four subsystems: janus, gui, jaaws and the
postprocessor. Among these four subsystems, the janus and the gui subsystems
(depicted as double circles) are made up of sub-modules as shown in Figures 9 and 10,
while the jaaws and the postprocessor subsystems (depicted as single circles) are
mapped  directly  to  modules  in  the  target  language.  After  entering  the  prototype  design 


into  CAPS,  we  used  the  CAPS  execution  support  system  to  generate  the  code  that 
interconnects  and  controls  these  subsystems.  In  addition,  a  simple  user  interface  was 
developed  using  CAPS/TAE  [22]  (Figure  11).  The  resultant  prototype  has  over  6000 
lines  of  program  source  code,  most  of  which  was  automatically  generated,  and  contains 
enough  features  to  exercise  all  parts  of  the  architecture.  The  code  that  handles  the  motion 
of  a  generic  simulation  object  was  very  simple,  but  it  was  designed  so  that  it  would  work 
in  both  two  and  three  dimensions  without  modification  (currently  the  initialization  and 
the  movement  plan  of  the  tank  object  never  call  for  any  vertical  motion).  The  code  was 
also  designed  to  be  polymorphic,  just  as  was  the  main  event  loop.  This  means  the  same 
code  will  handle  the  motion  of  all  kinds  of  simulation  objects  without  any  modifications, 
including  new  types  of  simulation  objects  that  are  part  of  currently  unknown  future 
enhancements  to  Janus  and  have  not  yet  been  designed  or  implemented. 
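
One way to obtain motion code that works unchanged in two and three dimensions is to treat positions and velocities as coordinate tuples of any equal length. The Python sketch below shows the idea. The function name advance and its parameters are hypothetical, and the prototype's actual motion code is not reproduced in the paper.

```python
def advance(position, velocity, dt):
    """Return the position reached after dt time units at constant velocity.

    Works for any number of coordinates, as long as both tuples agree.
    """
    if len(position) != len(velocity):
        raise ValueError("position and velocity must have the same dimension")
    return tuple(p + v * dt for p, v in zip(position, velocity))
```

A tank that never moves vertically can be represented either with two coordinates or with three coordinates and a zero vertical velocity component.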
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Figure  9.  The  JANUS  subsystem  of  the  executable  prototype. 


Figure 10. The GUI subsystem of the executable prototype.

Figure 11. The Graphical User Interface of the executable prototype

Our  prototyping  experiment  showed  that  the  proposed  object-oriented  architecture 
allows  design  issues  to  be  localized  and  provides  easy  means  for  future  extensions.  We 
started out with a prototype consisting of only two event subclasses (move and
end_simulation) and were able to add a third event subclass (do_plan) to the prototype
without modifying the event control loop of the Janus combat simulator.

We  also  demonstrated  the  use  of  inheritance  and  polymorphism  to  efficiently 
extend/specialize  the  behavior  of  combat  units.  For  example,  the  move_update_object 
method  of  a  tank  subclass  uses  the  general-purpose  method  from  its  superclass  to 
compute  its  distance  traveled  and  a  specialized  algorithm  to  compute  its  fuel 
consumption.  We  simply  include  one  statement  to  invoke  the  move_update_object 
method  of  its  superclass  followed  by  three  lines  of  code  to  update  its  fuel  consumption. 
Moreover,  other  combat  unit  subclasses  can  be  added  easily  to  the  prototype  without  the 
need to modify the event scheduling/dispatching code and usually without modifying
existing  event  handlers. 
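
The tank example can be sketched as follows. This is an illustrative Python sketch only, not the code of the prototype: the class names, the fuel units, and the linear fuel consumption rate are assumptions made for the example. Only the method name move_update_object and the pattern (one call to the superclass method followed by a few lines that update fuel consumption) come from the text.

```python
from dataclasses import dataclass


@dataclass
class SimulationObject:
    """Generic simulation object. Positions are 3-D, so the same code
    covers two- and three-dimensional motion."""
    position: tuple = (0.0, 0.0, 0.0)
    velocity: tuple = (0.0, 0.0, 0.0)
    distance_traveled: float = 0.0

    def move_update_object(self, dt: float) -> float:
        """Advance the object by dt and return the distance covered."""
        step = tuple(v * dt for v in self.velocity)
        self.position = tuple(p + s for p, s in zip(self.position, step))
        distance = sum(s * s for s in step) ** 0.5
        self.distance_traveled += distance
        return distance


@dataclass
class Tank(SimulationObject):
    fuel: float = 100.0          # hypothetical units
    fuel_per_unit: float = 0.5   # hypothetical linear consumption rate

    def move_update_object(self, dt: float) -> float:
        distance = super().move_update_object(dt)   # general-purpose part
        burned = distance * self.fuel_per_unit      # tank-specific part
        self.fuel = max(0.0, self.fuel - burned)
        return distance
```

Code that calls move_update_object on a SimulationObject works unchanged for a Tank, which is the property the prototype relies on when new combat unit subclasses are added.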

The  issues  raised  by  the  design  of  the  prototype  also  resulted  in  the  following 
refinements  to  the  proposed  architecture: 

1.  Extend the interface of the Execute_Event operation to return the time at which the
next  event  is  to  be  scheduled  for  the  same  simulation  object,  and  introduce  a  special 
time  value  NEVER  to  indicate  that  no  next  event  is  needed.  The  proposed  change 
turns  the  communication  between  the  event  dispatcher  and  the  simulation  objects 
from  a  peer-to-peer  communication  into  a  client-server  communication.  This  change 
eliminates  dependencies  of  event  handlers  on  event  queue  details  and  allows  the  event 
dispatcher  to  use  a  single  statement  to  schedule  all  recurring  events  for  all  event 
types. 
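
A minimal sketch of this refinement, in Python for illustration only. The names Event, Move, run, and the dictionary used as a stand-in simulation object are assumptions; only the idea of an execute operation that returns the time of the next event, and the special value NEVER, come from the text.

```python
import heapq
import itertools
import math

NEVER = math.inf  # special time value: no further event is needed


class Event:
    """Base class; each event subclass implements execute_event."""
    def __init__(self, target):
        self.target = target

    def execute_event(self, now: float) -> float:
        raise NotImplementedError


class Move(Event):
    def execute_event(self, now):
        self.target["x"] += 1
        # Recur every 2 time units until the (hypothetical) goal is reached.
        return now + 2.0 if self.target["x"] < self.target["goal"] else NEVER


def run(queue, end_time):
    """Dispatch events in time order. Queue entries are (time, seq, event)."""
    tie = itertools.count(len(queue))   # tie-breaker for equal times
    log = []
    while queue and queue[0][0] <= end_time:
        now, _, event = heapq.heappop(queue)
        next_time = event.execute_event(now)
        log.append((now, type(event).__name__))
        if next_time != NEVER:
            # The single statement that reschedules every recurring event.
            heapq.heappush(queue, (next_time, next(tie), event))
    return log
```

The handlers never touch the queue: they only report when they want to run again, so the queue representation can change without affecting any event subclass.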

2.  Instead  of  recording  the  history  of  a  simulation  run  in  sets  of  data  files,  model  the 
simulation  history  as  a  sequence  of  events.  The  proposed  change  provides  a  simple 
and  uniform  way  to  handle  history  records  for  all  events,  and  allows  the  same 
modular  architecture  to  be  used  for  real-time  simulations  as  well  as  post-simulation 
analysis.  It  also  eliminates  the  need  for  the  write-status  event,  reducing  the  number  of 
events  still  further.  This  approach  provides  the  greatest  possible  resolution  for  the 
event  histories,  which  implies  that  any  quantity  that  could  have  been  calculated 
during  the  simulation  can  also  be  calculated  by  a  post-simulation  analysis  of  the  event 
history,  without  any  loss  of  accuracy.  The  only  constraint  imposed  by  this  design 
refinement  is  that  the  simulation  objects  in  the  events  must  be  copied  before  being 
included  in  the  simulation  history,  to  protect  them  from  further  changes  of  state  as  the 
simulation proceeds. This constraint is easy to meet in a full-scale implementation
because the process of writing the contents of an event object to a history file will
implicitly make the required copy.
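
The copy constraint and the post-simulation analysis can be illustrated with a short Python sketch. All names here (Unit, HistoryRecord, History, distance_covered) are hypothetical, and an in-memory list stands in for the history file.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Unit:
    name: str
    x: float = 0.0


@dataclass
class HistoryRecord:
    time: float
    kind: str
    snapshot: Unit   # copy of the object taken when the event executed


@dataclass
class History:
    records: list = field(default_factory=list)

    def record(self, time, kind, obj):
        # Copy first, so later state changes cannot alter the history.
        self.records.append(HistoryRecord(time, kind, copy.deepcopy(obj)))

    def distance_covered(self, name):
        """Post-simulation analysis: recompute a quantity from the events."""
        xs = [r.snapshot.x for r in self.records if r.snapshot.name == name]
        return sum(abs(b - a) for a, b in zip(xs, xs[1:]))
```

Without the deepcopy call every record would refer to the same live object, and the history would show only its final state.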

The  prototyping  effort  also  exposed  a  design  issue  -  should  null  events  appear  in  the 
event  queue?  A  null  event  is  one  that  does  not  affect  the  state  of  the  simulation,  such  as  a 
move  event  for  an  object  that  is  currently  stationary.  The  prototype  version  adopted  the 
position  that  such  events  should  not  be  put  in  the  event  queue,  since  this  corresponds  to 
current  scheduling  policies  in  Janus,  and  appears  at  first  glance  to  improve  efficiency. 

Our  experience  with  the  development  of  the  prototype  suggests  that  this  decision 
complicates  the  logic  and  may  not  in  fact  improve  efficiency.  In  particular,  the  process 
create_new_events (see Figure 9) could be eliminated if we allowed null events. This
process  scans  all  simulation  objects  once  per  simulation  cycle  to  determine  if  any 
dormant  objects  have  become  active,  and  if  so,  schedules  events  to  handle  their  new 
activity.  The  alternative  is  to  have  the  constructor  of  each  kind  of  simulation  object 
schedule all of its initial events, and to have each event handler specify the time of the next
instance of the same event even if there is nothing for it to do currently. A handler might
still set the time of its next event to NEVER in the case of a catastrophic kill; however,
this  is  reasonable  only  if  it  is  impossible  to  repair  or  restore  the  operation  of  the  units  that 
have  suffered  a  catastrophic  kill.  The  reasons  why  this  design  change  may  improve 
efficiency  in  addition  to  simplifying  the  code  are  that: 

1.  the  check  for  whether  a  dormant  object  has  become  active  is  done  less  often  -  once 
per  activity  of  that  object,  rather  than  once  per  simulation  cycle, 
2.  executing a null event is very fast - a few instructions at most, so the “unnecessary”
null  events  will  not  have  much  impact  on  execution  time,  and 

3.  the  computation  to  find  and  test  all  simulation  objects  periodically  would  be 
eliminated. 

We  recommend  allowing  null  events  in  the  event  queue,  and  explicitly  scheduling 
every kind of event for every object unless it is known that there cannot be any non-empty
events of that type in any possible future state of the object. For example, under
the proposed scheduling policy, immobile or irrecoverably damaged objects would not
need to schedule future move events, but those that are currently at their planned
positions would need to do so, because a change of plan could cause them to move again
in the future, even though they are not currently moving. The resulting architecture
enables a very simple realization of the main simulation.
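
The recommended policy can be sketched as a move handler that always reschedules itself unless no non-empty move event can ever occur. This is a one-dimensional Python illustration under assumed names (Unit, execute_move, the mobile and destroyed flags, a fixed event period); it is not taken from the Janus code or the prototype.

```python
from dataclasses import dataclass

NEVER = float("inf")


@dataclass
class Unit:
    position: float
    planned_position: float
    mobile: bool = True
    destroyed: bool = False   # irrecoverably damaged
    speed: float = 1.0


def execute_move(unit: Unit, now: float, period: float = 1.0) -> float:
    """Handle one move event and return the time of the next one."""
    if not unit.mobile or unit.destroyed:
        return NEVER                      # no non-empty move can ever occur
    gap = unit.planned_position - unit.position
    if gap != 0.0:                        # otherwise this is a null event
        step = min(abs(gap), unit.speed * period)
        unit.position += step if gap > 0 else -step
    return now + period                   # reschedule even when stationary
```

A unit that has reached its planned position keeps a move event in the queue, so a later change of plan is picked up by the next event without any periodic scan of all simulation objects.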

5.  CONCLUSION 

Our  conclusion  is  that  substantial  and  useful  computer  aid  for  re-engineering  is  possible 
at the current state of the art. Human analysts and domain experts must also play an
important part in the process because much of the information needed to do a good job is
not  present  in  the  software  artifacts  to  be  re-engineered.  Success  depends  on  cooperation 
between  skilled  people  and  appropriate  software  tools. 

The  missing  information  needed  for  re-engineering  is  related  to  deficiencies  of  the 
current  system  at  all  levels,  from  requirements  through  design  and  implementation. 
Thorough  and  accurate  knowledge  of  these  deficiencies  is  crucial  for  success.  The  clients 
never want the re-engineered system to have exactly the same behavior as the legacy
system - if they were satisfied, there would be little motivation to spend time, effort, and
resources on a re-engineering project. Even if a system is being re-engineered for the
ostensible  goal  of  porting  to  different  hardware,  the  desired  behavior  at  the  interface  to 
the  hardware  and  systems  software  will  be  different. 

In  practical  situations,  the  requirements  for  the  re-engineered  system  are  different 
from  those  for  the  legacy  system.  Key  parts  of  the  requirements  for  the  new  system  are 
often  missing  or  incorrect  in  the  legacy  documents.  Some  of  that  information  is  present 
only  in  the  minds  of  the  clients,  often  fragmented  and  scattered  across  members  of  many 
different  organizations.  Communication  is  a  large  part  of  the  process,  and  that 
communication  cannot  be  automated  away,  although  it  can  be  enhanced  by  appropriate 
use  of  prototyping.  We  found  that  the  most  important  communications  were  those 
regarding newly recognized requirements issues, and that such recognition was often
triggered by discussions between people with different areas of expertise.

Uncertainties  about  the  true  requirements  play  a  central  role  in  both  re-engineering 
and  the  development  of  new  systems.  We  therefore  hypothesized  that  prototyping  could 
play a valuable role in re-engineering efforts. Our experience in the case study reported
here supports that hypothesis.

We  also  found  that  prototyping  can  contribute  substantially  to  the  process  of 
inventing, correcting, and refining the conceptual structures on which the architecture of
the new system will be based. Most legacy systems are too complicated for individuals to
understand. 

This  maze  of  details  hides  potential  opportunities  for  simplifying  and  regularizing  the 
conceptual  structure  of  the  system  to  be  re-engineered,  and  makes  it  difficult  to  recognize 
deficiencies in design and architectural structure. The simplification process implicit in
constructing  skeletal  prototypes  helps  expose  such  opportunities. 

We  found  that  there  are  fundamental  conceptual  errors  embodied  in  the  legacy 
structures and algorithms. Some of those errors were exposed when structural
asymmetries and irregularities were discovered in the process of extracting a model of the
legacy  software.  Others  were  discovered  only  with  the  help  of  the  oversimplified  models 
that  are  common  in  the  early  stages  of  prototyping  a  proposed  new  architecture. 
Constructing  a  small  and  simple  instance  of  the  proposed  architecture  raises  many  of  the 
main  design  issues,  and  the  simplicity  of  the  model  makes  it  much  easier  to  consider  and 
evaluate  alternative  designs  to  find  improved  structures. 

To  be  effective,  prototypes  must  be  constructed  and  modified  rapidly,  accurately,  and 
cheaply. The UML interaction diagrams lack the precision to support automatic code
generation for the executable prototype. This weakness can be remedied by the use of the
prototype language PSDL [12, 13] and the CAPS prototyping environment, which
provide effective means to model the system’s dynamic behavior in a form that can be
easily validated by users via prototype demonstration.
ABSTRACT

A  product  line  is  a  group  of  systems  sharing  a  common, 
managed  set  of  features  that  satisfy  specific  needs  of  a 
selected  market  or  mission.  In  the  product  line  approach, 
management,  system  developers,  and  a  reuse  team  are 
interested in some views of the product line. In this paper
a model is defined to present product lines, their derived
products, and common assets used in these product lines.
The  model  is  used  to  convey  views  of  interest  to  different 
stakeholders:  management,  system  developers,  and  a 
reuse  team  in  the  product  line  approach.  Its  purpose  is  to 
capture  information  and  present  this  information  about 
organizations'  product  lines,  and  make  it  visible  to  the 
stakeholders  inside  and  outside  organizations. 

Management  can  use  the  model  when  producing  new 
products  of  a  product  line,  negotiating  with  customers, 
and  assessing  the  benefits  of  adopting  the  product  line 
approach.  Product  line  developers  can  use  the  model 
when  developing  products  of  a  product  line.  A  reuse  team 
can use the model through asset identifications, ensuring a
successful  use  of  asset  base  in  and  across  product  lines, 
and  assessing  the  level  of  reuse. 

Keywords 

Product line, Product line architecture, COTS,
Organizational components, Stakeholders, and System-unique
components.

1  INTRODUCTION 

Organizations  that  develop  similar  products  are  adopting 
the  product  line  or  product  family  approach  to  deploy 
systems faster, at low cost, and with high quality. Systems
are produced in a product line using common architecture
and assets that are used across products. Organizations
reuse common assets, integrated assets, etc. that would
otherwise have to be needlessly repeated for each system.

Each  stakeholder,  i.e.  management,  systems  developers, 
and  reuse  team  is  interested  in  a  particular  view  of  the 
product  line.  Management,  for  example,  might  be 
interested  in  viewing  products  of  a  product  line  to 
estimate  time  and  schedules.  Systems  developers  might 
be  interested  in  a  view  of  a  product  line  looking  for 
common  assets.  The  reuse  team  might  be  interested  in  a 
view  of  a  product  line  to  assess  the  level  of  reuse  in  a 
product  line.  These  are  some  of  the  interesting  views. 

We  are  presenting  a  product  line  viewpoint  model  that 
shows  different  views  of  the  product  line,  its  derived 
products,  and  common  assets  used.  Also  we  are  showing 
how  the  model  conveys  particular  views  interesting  to 
management,  systems  developers,  and  reuse  team. 

Section  2  describes  the  product  line  concept.  Section  3 
describes  the  product  line  model.  Section  4  describes 
views captured by the model. Section 5 presents an empirical
model for product line validation. Section 6 presents
repository support. Section 7 is the conclusion.

2  PRODUCT  LINE  CONCEPT 
A  product  line  is  defined  as  a  group  of  products  sharing  a 
common,  managed  set  of  features  that  satisfy  specific 
needs of a selected market or mission [1, 4]. Products in
the  product  line  are  engineered  through  customization 
from  base  requirements  and  standard  product  line 
architectures,  and  integration  of  common  components 
rather  than  using  system-unique  software  [2]. 

The  product  line  architecture  is  one  of  the  important 
assets  shared  by  the  systems  in  a  product  line.  It  provides 
the  structure  for  building  systems  in  the  product  line.  All 
products  are  based  on  the  product  line  architecture. 

Product  line  assets  are  used  across  products  in  the  product 
line.  Product  line  assets  depend  on  the  solutions  common 
to  the  products  in  a  product  line.  Reusing  these  solutions 
reduces  or  eliminates  work  that  otherwise  would  be 
required  to  build  each  product  [3]. 

In the product line development, a dual life-cycle model
can be used in which domain engineering is the process
used to create domain artifacts useful across the entire
product line, and application engineering is the process
used to create a single product by adapting the domain
artifacts.

Table 1 The Viewpoint and Attribute Template

Viewpoint Template                   Attribute Template

Product line                         Name, contact person, customers.

Product line architecture release    Contact person, release number, number of times
                                     reused, development cost, number of staff,
                                     architectural style used, inter-component
                                     communication mechanisms used, operating
                                     system(s) and platform(s).

Product release                      Name, customer, release number, contact person,
                                     development time, cost, when developed, number
                                     of staff, status, operating system(s) and
                                     platform(s).

COTS component release               Vendor, release number, contact person, cost,
                                     number of times reused, operating system(s) and
                                     platform(s).

Organizational component release     Name, release number, contact person, developed
                                     internally or externally, cost, number of times
                                     reused, development time, number of staff,
                                     operating system(s) and platform(s).

System-unique component release      Name, release number, contact person,
                                     development cost, development time, number of
                                     staff, operating system(s) and platform(s).

Figure 1 Product Line Views Model

3  PRODUCT LINE VIEWS MODEL

A product line model that shows different views of a
product line, its derived products, and common assets used
is presented in this section. It defines the entities and
relationships used to present product lines.

It presents different ways of viewing a product line,
keeping in mind enhancement, modification, other models,
and other entities and relationships. Figure 1 depicts the model.

The following sections describe the product line model.

3.1.  Product Line Overview

A product line is defined as a group of products sharing a
common, managed set of features that satisfy specific needs
of a selected market or mission [1, 4]. A product line has a
group of products associated with it; it has a 1:M relation
with its products. A product line has a common architecture
associated with it; it has a 1:1 relation with its architecture.

3.2.  Product  Line  Architecture 
Product line architecture provides the structural elements
and their interfaces by which the system is composed out of
the product line [18]. Products are customized using the
product line architecture. Product line architecture might
evolve through the product line life cycle. New releases of
the product line architecture could be seen and this is due to
new customers’ requirements, new technologies,
etc. It has a 1:M relationship with its releases.
The early releases of product line architecture specify the
common components of the product line architecture;
they could specify the functionality needed by these
components and might specify their interfaces. An M:N
relationship exists between product line architecture
release and common component description/interface. After
common components are developed, later releases of
[...] releases [...] a 1:M [...].

3.3.  Products

Products in a product line are engineered through
customization from base requirements, standard product
line architectures, and integration of common components,
and might use system-unique components. Each product is
associated with its releases. Each product release has
architecture associated with it called product release
architecture. Product has a 1:M relationship with its
releases. Product release has a 1:1 relation with its
architecture.


3.4.  Product Release Architecture

Product release architecture is derived from the product line
architecture release and must conform to the product line
architecture release. It uses many common components
described by the product line architecture release; for each
common component used, it uses one of the releases of that
component. In addition, it might use many system-unique
components; for each system-unique component used, it
uses a release of that component.

3.5.  Components 

Components are the building blocks of products in a
product line and are classified into two categories: a
common component and a system-unique component. A
common component is used across products of a product
line and could be a commercial-off-the-shelf (COTS)
component or an organizational component. Organizational
components refer to common components developed by the
organization. They could be developed internally by the
organization owning the product line or externally by a
different organization within the business unit of which
the organization is a part. A system-unique
component is used in specific products. Both types of
components, common and system-unique, could have
releases associated with them and have a 1:M relationship
with their releases. They are used in many product releases
and have an M:N relationship with product release
architecture.
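
The entities and cardinalities described in Sections 3.1 to 3.5 can be summarized in a small data model. The following Python sketch is an illustration under assumptions: the class and field names are ours, the product release architecture is folded into ProductRelease because the two stand in a 1:1 relation, and the conforms check is one simple reading of "must conform to the product line architecture release".

```python
from dataclasses import dataclass, field


@dataclass
class ComponentRelease:
    component: str
    kind: str            # "COTS", "organizational", or "system-unique"
    number: str


@dataclass
class ArchitectureRelease:          # product line architecture release
    number: str
    common_components: list = field(default_factory=list)   # M:N (names)


@dataclass
class ProductRelease:
    number: str
    conforms_to: ArchitectureRelease
    uses: list = field(default_factory=list)   # M:N with component releases


@dataclass
class Product:
    name: str
    releases: list = field(default_factory=list)            # 1:M


@dataclass
class ProductLine:
    name: str
    architecture_releases: list = field(default_factory=list)  # 1:M
    products: list = field(default_factory=list)               # 1:M


def conforms(release: ProductRelease) -> bool:
    """Every common component used must be described by the architecture."""
    allowed = set(release.conforms_to.common_components)
    return all(c.component in allowed
               for c in release.uses if c.kind != "system-unique")
```

System-unique components are exempt from the check because the product line architecture release describes only common components.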

3.6.  Viewpoint Attributes

Entities in the viewpoints have some interesting attributes.
Table 1 presents the viewpoint and attributes template.
Organizations that adopt the product line approach might
be interested in other attributes; these attributes can be
added to the table. The attributes listed in Table 1 are used
to support the views described in Section 4 of this paper.

4  PRODUCT  LINE  VIEWPOINT  MODEL 

In the product line approach, product lines share several
different views that are interesting to management, system
developers, and a reuse team. Other interesting views might
be possible.

4.1.  Management View

Management  of  an  organization  that  adopts  the  product  line 
approach  has  authority,  vision,  and  leadership.  It  manages 
the  development  of  products  in  a  product  line.  They 
manage  staffing,  training,  cost,  directions,  and  schedules 
through the product line cycle. They have a clear vision
about the direction of a product line. They interact with
customers  and  make  business  decisions. 

Management  in  the  product  line  approach  can  be  interested 
in  the  products  derived  from  a  product  line,  customers  of 
these  products,  and  customer  contact  persons.  Also  they 
can  be  interested  in  cost,  contact  persons,  time  intervals, 
and  staffing  for  products  and  assets  used  in  these  products. 
Table 2. Experimental Model Phases for Product Line Validation

Phase                        Function                 Data Collection Methods
1. Adoption                  Assessment               Survey, Legacy
2. Planning & Management     Measurement & Control    Survey, Legacy
3. Utilization               Monitoring               Case Study, Project Monitor
4. Expansion                 Adaptation               Case Study, Survey, Legacy

This data is supported by the model. Management can use
this data when producing a new product of a product line,
negotiating with customers, and assessing the benefits of
adopting the product line approach.

The structure of the management view and its relationships,
as presented by the model, answers questions about which
products belong to a product line and which assets are used
in these products. Attributes of the model’s entities answer
questions about the customer, contact person, time interval,
cost, staffing, etc., of products in a product line.

4.2.  Reuse  Team  View 

A  reuse  team  of  an  organization  that  adopts  a  product  line 
approach  supports  reuse  across  product  lines.  They  support 
reuse  of  components  through  asset  identification.  With 
systems  developers  they  ensure  successful  use  of  asset 
bases  in  and  across  product  lines.  They  assess  the  reuse 
level across product lines. The reuse team can be interested in
viewing  product  lines,  their  derived  products,  and  reusable 
assets  (product  line  architectures  and  components)  used  in  a 
product  line.  They  can  also  be  interested  in  the  number  of 
times  an  asset  is  reused,  and  the  type  of  components  used  in 
a  product  line. 

The structure of the reuse team view and its relations,
as presented by the model, shows products of a product line
and assets used in these products. Attributes used in the
model’s entities answer questions related to the type of
components used and the number of times an asset is reused.

The  reuse  team  can  use  this  information  through  asset 
identification,  ensuring  a  successful  use  of  asset  base  in  and 
across  product  lines,  and  assessing  the  level  of  reuse. 

4.3.  Systems  Developers  View 

System developers in the product line approach are also
interested in viewing product lines, their derived products,
the product line architecture and its evolution, the assets used and
their evolution, the operating system(s) and platform(s)
used, component types, and their interfaces.

The structure of the system developers view and its relations,
as presented by the model, shows the products derived in a
product line, the product line architecture and its evolution, and the
components used and their evolution. Attributes used in the
model’s entities answer questions related to the contact person
of an asset, component interfaces, component type, and
operating system(s) and platform(s).
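
The three views can be read as different queries over the same model data. The Python sketch below illustrates this reading only; the sample releases, costs, and function names are invented for the example and do not come from any real product line.

```python
from collections import Counter

# Hypothetical product releases: (product, release, cost, components used).
RELEASES = [
    ("A", "1.0", 120, [("map", "COTS"), ("comms", "organizational")]),
    ("A", "1.1", 40, [("map", "COTS"), ("comms", "organizational")]),
    ("B", "1.0", 90, [("map", "COTS"), ("ui-b", "system-unique")]),
]


def management_view(releases):
    """Cost per product, summed over its releases."""
    cost = Counter()
    for product, _, release_cost, _ in releases:
        cost[product] += release_cost
    return dict(cost)


def reuse_team_view(releases):
    """Number of product releases that use each common component."""
    uses = Counter()
    for _, _, _, components in releases:
        for name, kind in components:
            if kind != "system-unique":
                uses[name] += 1
    return dict(uses)


def developer_view(releases, product):
    """Component types used by one product across its releases."""
    return sorted({kind for p, _, _, comps in releases if p == product
                   for _, kind in comps})
```

Each function touches only the attributes its stakeholder cares about, which mirrors how the attribute template supports each view.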

4.4.  Viewpoints  Development 

We  used  the  method  called  VORD  [17]  for  the 
development of viewpoints. Although this method is principally
intended for requirements discovery and analysis, it
includes  steps  that  help  to  translate  this  analysis  into  a 
viewpoint.  We  considered  only  the  first  three  stages  of  the 
VORD  method  concerned  with  viewpoint  identification, 
structuring,  and  documentation. 

a-  Viewpoint  Identification  involves  discovering 
stakeholder viewpoint and identifying the specific
attributes,  tasks,  and  sub-viewpoints. 

b-  Viewpoint  Structuring  involves  grouping  related 
viewpoints  into  a  hierarchy.  Common  viewpoints  are 
provided  at  higher  levels  in  the  hierarchy  and  are 
inherited  by  lower-level  viewpoints. 

c-  Viewpoint Documentation involves refining the
description of the identified viewpoints.
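
The structuring step, in which common viewpoints sit higher in the hierarchy and are inherited by lower-level viewpoints, can be pictured with a small Python sketch. The Viewpoint class and its fields are assumptions made for illustration and are not part of VORD itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Viewpoint:
    name: str
    attributes: set = field(default_factory=set)
    parent: Optional["Viewpoint"] = None

    def all_attributes(self) -> set:
        """Own attributes plus everything inherited from higher levels."""
        inherited = self.parent.all_attributes() if self.parent else set()
        return inherited | self.attributes
```

For example, a hypothetical top-level stakeholder viewpoint holding "contact person" would pass that attribute down to a reuse team viewpoint that adds "number of times reused".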

Viewpoints  and  attributes  information  in  VORD  are 
collected using standard forms. The forms used for
viewpoint information (the viewpoint template) and
attribute information (the attributes template) are shown in
Table 1.

The  viewpoints  and  attributes  templates,  as  well  as  the 
viewpoint  hierarchy  diagrams  are  developed  during  the 
three  phases  shown  in  Table  1.  The  templates  are  used  to 
structure the information collected, and in general a
template cannot be completely filled in during a single
activity.

5  EMPIRICAL MODEL FOR PRODUCT LINE VALIDATION

In  this  section  an  experimental  integrated  model  for 
product  line  pilot  project  planning,  measurement,  and 
assessment  is  presented.  This  section  discusses  how 
qualitative  and  quantitative  process  and  product  line  goals 
are established based on customer and business needs. The
process of flow-down of goals to the level of processes and
the experimental pilot model is described. Table 2 presents
the empirical and engineering model phases for product
line  validation. 

5.1.  Making  the  Product  Line  Adoption  Decision 
Product line adoption is defined in the context of an
organization’s rationale to agree, sponsor, commit, or
allocate resources for initiating a product line plan or
project. Product line utilization is defined in the context of
an organization as the creation of assets with the specific
“intention” to be reused as well as the utilization of assets
that had been specifically created with the “intention” of
being reused. Product line management is defined in the
context  of  an  organization  that  manages  the  creation, 
utilization,  and  evolution  (i.e.,  maintenance)  of  reusable 
assets.  The  application  of  software  reuse  technologies  to 
planned  products  (both  new  and  existing)  and  planned 
product  lines  is  an  indicator  that  software  reuse  adoption  is 
strongly  correlated  with  organizational  opportunities. 

Most  software  development  organizations  operate 
according  to  marketing  and  finance  strategies.  An 
organization  wishing  to  improve  its  financial  status  may 
look  for  new  or  extended  opportunities  in  software  product 
markets.  Product  line  is  one  possible  approach  that  may  be 
used  to  leverage  decreased  time  to  such  markets  with 
decreased  effort  and  increased  product  quality. 


So  the  first  step  is  to  make  the  product  line  adoption 
decision  based  on  some  empirically  validated  software 
reuse  reference  model  (RRM)  [Nada  97].  This  in  turn  will 
lead  to  a  set  of  decisions  balancing  market  opportunities 
with  market  risks.  This  step  will  also  identify  reuse 
opportunities,  reuse  objectives,  costs,  constraints,  and 
options. 

For the adoption decision, organizations conduct an analytical
study to decide whether or not to adopt a certain product line
process or technology. This study collects both qualitative
and  quantitative  benchmark  data  on  the  product  line 
approach. 

The  adoption  phase  includes  several  steps  to  evaluate  the 
technical  and  organizational  aspects  of  the  introduced 
product  line  process  or  technology. 

5.1.1  Organization context

Organization  context  describes  the  environment  in  which 
the  organization  exists  or  existed  when  it  launched  the 
product line effort. The following lists common factors that
are used in the adoption phase to evaluate the existing
environment before applying the product line approach.
These factors are used to record and evaluate
the context environment of organizations that have adopted the
product line approach. They are also used by organizations
exploring the transition to the product line approach.

Process or technology objective. To adopt the product
line approach, the objective of developing product lines
needs  to  be  addressed  and  defined.  This  includes  defining 
the  scope  of  the  product  line,  how  long  the  organization  has 
been  building  product  lines,  and  the  product  line  life  cycle. 

Costs/benefits. Organizations that have already adopted these
processes or technologies should have data related to the
costs and benefits of this adoption. Organizations that are
considering adopting a software reuse approach might not have
data  about  the  cost  of  adopting  this  technology,  but  the 
benefits  of  software  reuse  approach  should  be  defined. 
Cost  varies  based  on  the  size  and  the  number  of  products  in 
the  organization,  the  technical  experience,  organization 
structure  needed,  skills  and  training,  and  tools. 

Commonalities and variabilities. Organizations exploring
the transition to a software reuse approach should identify
which products can be considered and what their
commonalities and variabilities are.

Common  architecture.  Organizations  exploring  software 
reuse  approach  should  consider  the  feasibility  of  common 
architecture  for  their  products.  Also  the  style  of  the 
architecture  might  be  defined,  e.g.  layered  architecture, 
client  server  architecture,  etc. 

Assets used. In the software reuse development approach,
products are assembled using a common set of assets and
might use system-unique assets. Assets could be domain
models, communication protocol descriptions, user
interface descriptions, code components, types of common
components that are developed internally or obtained as
commercial off-the-shelf (COTS) components, application
generators, domain knowledge, test plans and procedures,
requirement descriptions, performance models, metrics, etc.
Organizations that have adopted the software reuse approach
record the common assets used in their products. Organizations
exploring the transition to the product line approach should
define what common assets exist.

Level of reuse. One of the benefits of adopting software
reuse is an increased level of software asset reuse in the
organization. Organizations adopting a reuse approach
should have, or obtain from other organizations, data on the
percentage of reuse achieved by adopting this approach.
The type of reuse should also be recorded, for example
horizontal or vertical reuse. Horizontal reuse represents wide
domain width reuse, i.e. a component that can be used in many
applications. Vertical reuse represents narrow domain
width reuse, i.e. a component that can be used in one
application.
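
As an illustration of the two measurements named in this factor, the following Python sketch computes a reuse percentage and classifies a component as horizontally or vertically reused. It is only a sketch under stated assumptions: measuring reuse as reused lines of code over total lines of code is one common convention and is not prescribed by this paper, and the function names and sample figures are hypothetical.

```python
def reuse_level(reused_loc: int, total_loc: int) -> float:
    """Percentage of a product's code that came from reused assets.

    Assumption: reuse is measured in lines of code (LOC).
    """
    if total_loc <= 0:
        raise ValueError("total_loc must be positive")
    if not 0 <= reused_loc <= total_loc:
        raise ValueError("reused_loc must be between 0 and total_loc")
    return 100.0 * reused_loc / total_loc


def reuse_type(applications_using: int) -> str:
    """Classify a component following the definitions in the text:
    usable in many applications -> horizontal, in one -> vertical."""
    if applications_using < 1:
        raise ValueError("a reused component serves at least one application")
    return "horizontal" if applications_using > 1 else "vertical"


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(reuse_level(30_000, 120_000))  # percentage of reused code
    print(reuse_type(5))
```

Other size measures (function points, number of components) could be substituted for LOC without changing the structure of the calculation.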

Organization  structure.  The  organization's  structure  for 
developing  one-at-a-time  systems  might  not  be  suitable  to 
product  line  development.  Adopting  a  product  line 
approach  has  an  impact  on  organization  structure.  This 
factor  defines  the  impact  of  the  new  structure  needed  to 
adopt  the  product  line  approach.  The  impact  might  be  low, 
medium,  or  high. 

Process. The processes used in developing one-at-a-time systems
will not be suitable for product line development. As
part of adopting reuse technology, existing processes might
be modified and new processes need to be put in place, e.g.
customer interface processes, software development
processes, etc. This factor defines the impact on the
organization's processes of adopting the new approach, which
processes need to be changed, and which
new processes are needed.

Training. Transitioning to new processes or technology
requires skilled personnel if the transition is to
succeed. This factor defines the type of training
needed, e.g. in-house training, external consultants, etc. It
also defines who needs training, e.g. management, systems
developers, etc.

Tools.  This  factor  defines  which  tools  are  needed  in 
software  development,  e.g.  tools  to  assemble  products, 
configuration  management  tools,  tools  to  record  the 
progress  of  the  product  line  development,  etc. 
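
The context factors listed above can be captured as a simple record so that an organization can see which factors it has not yet documented. The Python sketch below is illustrative only: the class, field, and function names are hypothetical and are not part of the model described in this paper; only the low/medium/high impact scale is taken from the text.

```python
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import Optional


class Impact(Enum):
    """Impact scale used for the organization structure factor."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class ContextRecord:
    """One organization's answers to the context factors (hypothetical layout)."""
    objective: str
    architecture_style: str
    common_assets: list = field(default_factory=list)
    reuse_percentage: Optional[float] = None
    organization_impact: Impact = Impact.MEDIUM
    training_needed: list = field(default_factory=list)
    tools_needed: list = field(default_factory=list)


def missing_factors(record: ContextRecord) -> list:
    """Return the names of factors that have not been recorded yet."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing


if __name__ == "__main__":
    record = ContextRecord(objective="Adopt a product line",
                           architecture_style="layered")
    print(missing_factors(record))
```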

Software reuse assessment is the main function of this
phase. Historical methods are used to collect data, e.g.,
survey and/or legacy.

5.2.  Product  Line  Planning 

Organizations use this phase to plan the transition to the
product line software development approach. Organizations
can use this phase to record, evaluate, and assess the
planning for the product line approach. Organizations
intending to adopt software reuse use this phase to put
software reuse into practice.

This section describes the common factors that make up the
implementation plan for the software reuse approach, as part
of the planning phase.

Management Support. Building software products is not
just an engineering agenda; it precipitates changes in
personnel, personnel management, incentives, customer
interface, scheduling, budgeting, and a whole host of
management practices. It is a new way of working, and unless
management vigorously and actively
supports the transition, the effort will fail. A software reuse
strategy means that organizations and managers have less
direct control over their product developments and an
increased dependency on other organizations to understand
their requirements and provide acceptable solutions. Giving
up this control, and the dollars necessary to support product
line technology and application development, may be
difficult. Organizations that have adopted the software reuse
approach should record their experience of management
support, and evaluate and assess that support.

Cultural change. The software reuse concepts should be
defined and understood by the people in organizations
adopting this new approach. A particular attitude that has
to be overcome is the one-at-a-time mentality of building
a system for its own sake rather than as a contributing
effort to the organization's strategic goal of fielding and
building up a base set of core assets. Software reuse
terminology should be defined and understood across the
organization.

Organization structure. Adopting a new technology or
process has an impact on the organizational structure. For
example, organizations that develop product lines have a
structure different from organizations that develop
one-at-a-time systems. Some organizations have a product line
structure in which a marketers group relates product line
capabilities to prospective customers and relates customer
needs to asset and application developers. A core assets group
develops the architecture and other assets for the product
line. An application group delivers systems to customers.
There are different players in the product line approach, and
they need different skills to launch the product line
approach. Transitioning to the product line approach
requires the organization's structure and the players in the
product line approach to be defined.

Training and processes. Transitioning to software reuse
involves education and training on the part of management
and technicians. Managers need to support the business
motivation and strategy of the software reuse approach.
They need to understand the role of the infrastructure
technologies, understand how to monitor progress, and
identify potential problems within their area of the
organization. Different kinds of training may be needed,
including on-the-job mentoring.

New processes are needed to develop a product line; these
differ from the processes used in developing one-at-a-time
systems. These processes might be customer interface
processes, development processes, resource ownership
processes, etc.

Training and process changes should be defined in the
transition to the product line approach.

New technologies. Technologies allow organizations to
keep a competitive edge. Examples of technologies
used in the production of product lines are domain
engineering and application engineering. Domain
engineering is used to create artifacts useful across the
entire product line. Application engineering is used to
produce a single product by adopting the domain-wide assets.
Other technologies include, for example, CORBA and COM.
These technologies need to be defined in the transition
to the software reuse development approach.

Tools support. Using tools to support the new
development approach increases an organization's productivity.
Some organizations use tools to assemble
products. Others use tools to capture domain
knowledge, etc. The types of tools used need to be
defined in the transition phase.

Software reuse measurement is the main function of this
phase. Historical methods are used to collect data, e.g.,
survey and/or legacy.

5.3. Utilization and Management

Product line utilization is defined in the context of an
organization as the creation of assets with the specific
intention to be reused and the utilization of assets that
were specifically created with the "intention" of being
reused.

The next step is to decide upon the levels of RRM
utilization and management and to look closely at any
significant changes or impacts on both top and middle
management. This step includes the assessment of the
organization's willingness to adopt the RRM, the
implementation levels, and the incremental investment
strategies.

5.3.1 Product Line Utilization

Asset Utilization: The objective of processes in this family
is to utilize existing assets in software development and
evolution (i.e., maintenance) activities. The processes for
this family consist of developing or selecting criteria for
asset identification, modifying or tailoring the selected
asset(s), and integrating the selected asset(s) in the system
under development or evolution.


This step is the actual production phase, applying an
evolutionary approach (the Boehm Spiral Life-Cycle Model)
to the reuse plan implementation. Our early research results
have shown that software development organizations at a
high success (capability) level usually carry out several
pilot (experimental) projects to help them in the
construction of a prototype repository, component model
definition, components classification scheme definition,
model, common architecture, and product line, as follows:

I. Develop a prototype (pilot project).

II. Learn and evaluate risks versus opportunities
(including assessment of effort, quality, schedule, tools,
and procedures).

III. Expand the prototype to a safer version of the product
line with the necessary adjustments.

Repeat steps (II) and (III) until you achieve a stable product
line version.

This approach to successfully learning and evolving the
RRM within an organization is like the Boehm Spiral Life-
Cycle Model [8] applied to the RRM implementation plan.

5.3.2  Product  Line  Management 

Reuse management is defined in the context of an
organization that manages the creation, utilization, and
evolution (i.e., maintenance) of reusable assets.

Asset Management and Control: The objective of processes
in this family is to develop and organize collection(s) of
quality reusable assets, define and develop services and
capabilities to access these assets (i.e., for asset utilization
processes), and establish, support, and enact a broker role
for asset developers (i.e., from asset creation) and asset
consumers (i.e., from asset utilization).

The  reuse  management  and  control  is  based  on  the  classic 
plan,  enact,  and  learn  cycle.  The  plan,  enact,  learn  cycle  in 
the  reuse  management  idiom  is  based  on  the  following 
principles as described in the STARS CFRP [11].

Software reuse monitoring is the main function of this
phase. Observational and historical methods are used to
collect data, e.g., survey, case study, historical analysis
and/or legacy.

5.4.  Product  Line  Expansion 

In this phase, organizations look for new product
opportunities and assess customer needs and future reuse
plans.

This involves determining and evolving the future objectives,
strategy, and scope of a reuse program, resulting in the
selection of a set of suitable domains and product lines in
which to apply reuse within the organization. It also involves
planning, establishing, monitoring, and evaluating Reuse
engineering idiom (asset creation, asset management, and
asset utilization) projects addressing the selected domains
and product lines, as well as looking for new market
opportunities, analyzing the market, and assessing
future financial plans.

Software  reuse  adaptation  is  the  main  function  of  this 
phase.  Observational  and  historical  methods  are  used  to 
collect  data,  e.g.,  survey,  case  study,  historical  analysis 
and/or  legacy. 

6  REPOSITORY  SUPPORT 

Organizations adopting the product line approach can use a
repository to implement the model. The repository
supporting the product line approach can capture the
entities, their related attributes, and the relationships
between these entities to convey the model's views. A web-
based repository is a good choice to implement the model.
It provides easy access for many users, internal or
external to the organizations developing product lines. The
web-based repository can model the entities, some of their
related attributes, and their relationships as hypertext links
to present a complete picture of the entire product line.
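
To make the idea of modeling entities, attributes, and relationships as hypertext links concrete, the sketch below renders one repository entity as a small HTML fragment. It is a minimal illustration under assumptions, not the repository described in this paper: the `Entity` class, the `page_name` naming scheme, and the sample entity are all hypothetical.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Entity:
    """A repository entry: a product line, a derived product, or an asset."""
    name: str
    kind: str
    attributes: dict = field(default_factory=dict)
    related: list = field(default_factory=list)  # names of related entities


def page_name(name: str) -> str:
    """Hypothetical naming scheme: one HTML page per entity."""
    return name.lower().replace(" ", "_") + ".html"


def render(entity: Entity) -> str:
    """Render attributes as a list and relationships as hypertext links."""
    lines = [f"<h1>{escape(entity.kind)}: {escape(entity.name)}</h1>", "<ul>"]
    for key, value in entity.attributes.items():
        lines.append(f"<li>{escape(key)}: {escape(str(value))}</li>")
    lines.append("</ul>")
    for other in entity.related:
        lines.append(f'<a href="{escape(page_name(other))}">{escape(other)}</a>')
    return "\n".join(lines)


if __name__ == "__main__":
    line = Entity("Radar Line", "Product line",
                  {"scope": "tracking"}, ["Tracker Core"])
    print(render(line))
```

Following the links from a product line page to its products and assets is what would give users the complete picture of the product line mentioned above.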

7  CONCLUSIONS 

Organizations  that  produce  similar  systems  are  moving 
towards  implementing  the  product  line  approach.  Products 
in  the  product  line  approach  are  engineered  through 
customization  from  base  requirements  and  product  line 
architectures,  integration  of  common  components  and 
system-unique  components. 

The model described in this paper is intended to capture a
view of the product line, its derived products, and the assets
used in the product line. The model is defined to present
views of interest to management, system developers, and the
reuse team in the product line approach.

ABSTRACT

The practicing and researching software engineering
communities are still in need of professional practice
tutoring systems that can be easily used to identify reuse
experiences from successful enterprises, based upon a
validated software reuse reference model for the software
reuse process within the general software development
life-cycle. This paper presents a public Case-Based System
using a validated Reuse Reference Model (CBS-RRM). A CBS-RRM
allows software engineers to improve reuse practices by being
tutored with selected course material based on the user
profile. This material is combined with practice-based
knowledge derived from different positive cases from software
development organizations' reuse practices. A CBS-RRM provides
software engineers with a way to be tutored using positive
lessons learned by other organizations. Our research focuses
on achieving more effective means for software development
organizations to find alternative educational (training)
solutions to problems in the successful practice of reuse. The
paper focuses only on the CBS module.

Keywords 

Case-Based Reasoning Systems, Intelligent Tutoring,
Distance Learning, Learning Environments, Web-Based
Training Systems.

1  OVERVIEW 

1.1  Intelligent  Tutoring  Systems 

Traditional intelligent tutoring systems are based on the
assumption that a student's thinking process can be
modeled, traced, and corrected.


Based on the principles of Computer Assisted Instruction
(CAI), intelligent tutoring systems would allow for a
generic model that can be used for any individual. There
are four main components of an intelligent tutoring system.
The student module (1) consists of the incorrect and
incomplete knowledge that a student begins with. The
expert module (2) contains the correct, expert-like
knowledge that is to be transferred and learned. This
transfer of learning occurs as a two-way communication
process, made possible through (3) the graphical user
interface (GUI). The pedagogical element (4) is the basis of
the instruction, and it determines what instruction will be
given at which point. Some intelligent tutoring systems
also incorporate full simulation as part of the instruction.

The term "intelligent" refers to the system's ability to know
what to teach, when to teach it, and how to teach it. It must
have the capacity to understand, learn, reason, and solve
problems. It must be capable of identifying a student's
strengths and weaknesses and establishing a training plan that
is consistent with these results. It can pick up relevant
learning information from the student (such as learning
style), and apply the best means of instruction for that
particular individual. Throughout the instruction, the
system makes judgments about what the student knows and
how well she/he is processing the information. The
instruction can then be tailored to the student's needs. [3 1
J, o, 22]

1.2 Software Reuse Reference Model (SRRM)

In recent years, reusability has become an important factor
in the process of software development. In fact, the
availability of reusable assets in development phases
provides valuable support to design and implementation
with software architectures by improving productivity,
quality, and time-to-market [14]. Industry has demonstrated
that reuse of software assets will provide a basis for
dramatic improvements in quality and reliability, speed of
delivery, and long-term decreases in costs for software
development and maintenance. Some researchers estimate
that even with a less than 50% reuse rate, component-based
software development leads to reliability improvement as
much as ten times that of development that is not
component-based [7].

Opportunistic software asset reuse will not always succeed
if it is not based upon a supporting reference model for
developing software [33]. Hence, a Software Reuse
Reference Model (SRRM) may be considered a key
starting element to implement, realize, and quantify such
savings. The SRRM needs to include both the technical and
the organizational activities required to implement reuse
successfully.


1.3  Case-Based  Systems  (CBS) 

Case-Based Systems (CBSs) offer a knowledge architecture
system for managing, sharing, and accessing knowledge. A
CBS unifies many previous forms of knowledge
management into a single intuitive mechanism. CBSs
support such diverse knowledge types as structured data,
free-text documents, activity patterns, and expert system
knowledge bases. CBSs unify access methods such as
query-by-example, free-text retrieval, decision trees, and
case-based reasoning (CBR).

There are two primary benefits to the use of CBSs. The first
is to provide access to a broad spectrum of on-line
knowledge through a single access method. The second is
that CBSs are fundamentally superior for certain types of
access, especially ad-hoc searches for relevant knowledge
to help answer a question or resolve a problem.

The CBS approach uses the technique of comparing a
current situation (e.g. company profile) to a library of
known solutions (e.g. successful professional practices).
CBS has been applied to a range of classification and
construction tasks. It is particularly useful in tasks where a
formal set of rules, patterns, or algorithms for generating
solutions is difficult to obtain, but where examples of
correct solutions are readily available. These "previous
solutions" are stored as "cases" in a case base. The case
base can be used for multiple purposes, including training
and human and automated decision-making. Because of
this, a CBS can keep pace with a changing environment by
adding and improving cases, eliminating the need for
repeated software upgrades performed by knowledge
engineers. Because of the simple knowledge representation,
using case study templates and patterns, little expertise is
required to maintain the CBS. The CB manager does not
need to be a programmer [1, 5, 24, 30].

1.4  CBS  and  SRRM  Correlation 

It  is  necessary  for  software  developers  to  have  systematic 
procedures  supported  by  a  CBS  and  a  validated  SRRM  to 
provide  a  real  starting  point  for  good  software  assets  reuse 
and  adoption  decisions,  utilization  decisions,  and 
management  activities.  In  addition  to  a  SRRM,  an 
organization  interested  in  moving  into  a  reuse-oriented 
software  development  methodology  also  needs  more 
detailed  knowledge  about  how  to  implement  the  SRRM  in 
the  organization.  Hence,  access  to  a  CBS  with  this  more 
detailed  knowledge  would  be  very  useful. 

It  is  important  for  software  reuse  practitioners  and  new 
enterprises  that  are  interested  in  adopting  software  reuse  to 
access  lessons  learned,  access  more  detailed  knowledge 
about  how  to  implement  the  SRRM  in  the  organization,  and 
access  reuse  experience  of  successful  enterprises  based 
upon  a  validated  SRRM  for  the  software  reuse  process. 
Accessing  these  three  kinds  of  knowledge  is  but  a  first  step 
in  an  iterative  software  improvement  environment.  Usually, 
it  is  important  to  know  what  lessons  and  experiences  lead 
to  improved  software  development.  But  it  is  equally 
important  to  be  able  to  implement  and  practice  the  skills 
behind  these  lessons  and  experiences  so  that,  by  doing  and 
not  just  knowing,  measured  improvements  will  occur. 
Hence,  a  second  step  is  building  an  educational 
environment,  based  upon  individual  tutoring,  where  the 
knowledge  accessed  in  the  CBS  can  be  incorporated  into 
individualized  learning  based  on  implementation  and 
practice  of  those  skills  that  will,  in  turn,  lead  to  measured 
improvement.  Measured  improvement  can,  in  turn,  lead  to 
increased  software  assets  quality  and  increased  process 
productivity. Section 3.3 describes such a total CBS-SRRM
educational  environment  where  learning  based  upon 
individualized  tutoring  can  take  place. 


The existence of a publicly accessed Reuse CB (National
Reuse CB), via our CBS-RRM, will help the software industry
and academia capture best practice-based knowledge
derived from different software development organizations'
reuse programs and activities. This reusable set of best
practices, available through our proposed CBS-SRRM,
could provide the software industry and academia with a
systematic way to capture and access the lessons learned by
other organizations. This will promote recurrence of good
reuse practices and improve current reuse processes through
increased software quality and decreased effort and time to
market.

Having a set of case studies that can be used to derive
solutions to reuse problems from prior lessons learned will
enable the following: (1) Describe current
problems and identify ways to avoid them in the future. (2)
Find opportunities and possible successes applying
reuse. (3) Derive new knowledge from ongoing research
projects. (4) Better leverage best reuse practices. (5) Avoid
[...] and (6) [...] the case studies and lessons learned are
reused more often. Organizations that have
successfully adopted, utilized, and managed reuse could
indirectly help organizations that have similar environments,
problems, or situations, and that are interested in adopting
software assets reuse, locate the information
about best software assets reuse practices and decisions
about whether or not to adopt, utilize, and/or manage
software development based upon reuse.

2  WHAT  IS  MISSING 

Referring  to  our  previous  research  in  the  area  of  software 
assets  reuse  [Nada  97,  27],  the  practicing  and  researching 
software  engineering  communities  are  still  in  need  of  the 
following  professional  practice  resources: 

•  A publicly accessed CBS for the software engineering
community that can be easily used to identify lessons
learned and reuse experiences from successful
enterprises based upon a validated software reuse
reference model for the software reuse sub-process
within the general software development processes.

•  Use of an applicable, conceptualized, effective, and
validated software assets Reuse Reference Model that
considers and incorporates all technical and non-
technical aspects of the software reuse process.

•  On-line Software Reuse Self-Assessment system.

•  On-line Software Reuse Individualized Distance
Learning system.

•  Identification of effective software assets reuse
process and product metrics.

•  Identification of standardized reuse practices, i.e. a
systematic software reuse methodology.


3 CBS-SRRM KNOWLEDGE BASED TUTORING SYSTEM

Based on the principles of Computer Assisted Instruction
(CAI), CBS-SRRM tutoring systems would allow for a
generic model that can be used for any individual who is
involved in software development and engineering [31].

3.1 CBS-SRRM Overview

Our current project, funded by the NSF, investigates
Case-Based System (CBS) tool-kits using a
validated Software Reuse Reference Model (SRRM).

CBS-SRRM  allows  the  software  engineer  to  improve  reuse 
practice  by  capitalizing  on  effective  practice-based 
knowledge derived from different software development
organizations'  reuse  practices.  CBS-SRRM  provides 
software  engineers  with  a  way  to  utilize  lessons  learned 
by  other  organizations.  The  system  also  promotes 
recurrence  of  good  reuse  practices. 


Our research focuses on achieving more effective means for
software development organizations to find alternative
solutions to problems in the successful practice of reuse. We
demonstrate that developing a CBS-SRRM that will allow
software developers to learn how organizations similar to
theirs have successfully adopted, utilized, and managed this
technology can support improved reuse practices. The plan
is to research, develop, and make publicly available what
our affiliates and we have learned through our evolving set
of case studies, surveys, interviews, and experimental
results. This plan is carried out by researching and
developing a publicly accessible reuse practices CBS for
the software engineering community, using lessons learned
and reuse experiences from successful enterprises based
upon a validated SRRM that incorporates important
technical, organizational, and cultural factors needed in
adopting, utilizing, and managing reuse technology.



The main objective of this research is to develop a tutoring
system including a knowledge-based, web-based distance
assessment module that is technically supported by Case-
Based Reasoning (CBR) technology.


The objective is to motivate software developers to access a
web-based tutoring system including an assessment module
that will help them improve their software development
process using reuse practices. The practical implication is
to provide trainees with a demonstration of a more
efficient, more effective, and publicly accessed assessment
and teaching package that will enhance their learning
outcomes, increase their productivity, and improve their
product quality in a shorter time.


We have collected, and continue to collect, data from
industry on actual processes used and experiences with
software reuse. This data is collected and then presented
in a standard form based on a validated model.
The CBS-RRM also provides an interface to allow users to
describe their own environment and objectives and to
receive the data corresponding to the recorded projects that
best match their profile. Work such as this can be of great
value for developers who are under increasing economic
pressure to avoid building each new system from the
ground up. It is also of value to the research community as
an empirical basis for the validation of claims and methods
related to software reuse.

3.3 A CBS-SRRM Tutoring System
There are four main components of the tutoring system. (1)
The student profiling module qualifies the student
for a certain software engineering domain and identifies the
size of the student's or trainee's (user's) organization. (2) The
assessment module examines and assesses the user's
previous software reuse experience, his/her organization's
reuse potential, capability, and RRM level, and the depth of
the user's knowledge and experience in reuse. This step is
followed by a pre-test to evaluate the student's/trainee's
background knowledge of reuse; our prototype can identify
3 levels: initial, middle, or advanced. Based on the
outcome of the previous two modules and the results of the
user's pre-test, the student is assigned to a certain
level of training material. (3) The CBS module uses the
profiling information to match the student with several case
studies, and presents the best software reuse practices that
have been used by similar organizations. This module
contains the correct, expert-like knowledge that is to be
transferred and learned. (4) The fourth module contains the
course material that fulfills the users' needs and matches
their profile.
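
The assignment of a trainee to one of the three levels can be sketched as follows. This Python fragment is illustrative only and is not the prototype's implementation: the score thresholds, the course titles, and the function names are hypothetical; only the three level names (initial, middle, advanced) come from the text.

```python
# Hypothetical pre-test thresholds (percent); the paper does not specify them.
MIDDLE_THRESHOLD = 40
ADVANCED_THRESHOLD = 75

# Hypothetical course catalogue keyed by level.
COURSE_MATERIAL = {
    "initial": ["Introduction to software reuse"],
    "middle": ["Domain engineering basics"],
    "advanced": ["Managing a reuse program"],
}


def assign_level(pretest_score: float) -> str:
    """Map a pre-test score (0-100) to one of the three training levels."""
    if not 0 <= pretest_score <= 100:
        raise ValueError("pretest_score must be between 0 and 100")
    if pretest_score < MIDDLE_THRESHOLD:
        return "initial"
    if pretest_score < ADVANCED_THRESHOLD:
        return "middle"
    return "advanced"


def select_material(pretest_score: float) -> list:
    """Return the course material for the level the score maps to."""
    return COURSE_MATERIAL[assign_level(pretest_score)]


if __name__ == "__main__":
    print(assign_level(55), select_material(55))
```

In the system described above, the profiling and assessment outcomes would also feed into this decision; the sketch uses the pre-test score alone to keep the example short.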


The current CBS-SRRM tutoring system allows software
developers to learn how organizations similar to theirs have
successfully adopted, utilized, and managed improved
reuse practices, based upon a validated SRRM
that incorporates important technical, organizational, and
cultural factors needed in adopting, utilizing, and managing
reuse technology. We are researching, developing, and
making publicly available what our affiliates and we have
learned through our evolving set of case studies, surveys,
and interviews, thereby making it available to the whole
software engineering community.

3.4  CBS-SRRM  Architecture 

Using a web-based Distance Assessment and Tutoring
system combined with the CBR system will provide tools
that give students and supervisors a good
educational system to improve the individual's skills and
knowledge in software reuse. The CBS-SRRM
Architecture is depicted in Fig. 1. The remaining part of
this paper will focus only on the CBS module.


Fig.  1  CBS-SRRM  Architecture 

3.4.1 Searching Requirements of the Best Practices CBS
Believing  that  analogues  may  provide  a  way  to  predict 
results  based  upon  what  has  been  true  in  the  past,  the 
CBS  s  searching  mechanism  will  be  developed  alono  the 
lines  of  searching  systems.  It  will  maintain  a  CB  of  cases 

reuse^When  PerforTCe  °f  best-Practiced  software 
reuse.  When  the  partially  known  profile  of  a  new 

s0e1rchZtL0nrSRPrrSef  1  *  the  CBS’ the  search  en§'ne  will 
itc/thp.v  the  case(s)  of  organization(s)  and 

its/their  profile(s)  that  is/are  most  similar  with  the  profile  of 

the  new  organization,  and  finally  predict  the  level  of 
practice  of  the  organization  in  the  CB  that  will  be  the  level 
of  practice  assigned  to  the  new  organization.  We  adopted 
the  following  CBS  Architecture  (Fig.  2)  [37]. 




Fig.  2  CBS  Architecture 


Best Practice Cases: Development of CB Study Subjects

The  participant  subjects  are  software  development 
organizations  who  (1)  have  already  been  case  study 
participants  and  who  are  initially  in  our  CB  of  best 
practices,  and  (2)  are  considering  adopting,  utilizing  and 
managing  software  reuse.  Nada  worked  on  the 
identification  and  evaluation  of  new  CB  subjects.  Initially 
each  organization  will  constitute  a  case  that  contains  the 
profile  of  certain  user  attributes.  Cases  that  include  all  of 
this  information  will  comprise  the  space  of  CB  cases.  Cases 
that are lacking the final software reuse practice level
assigned, but contain at least a subset of the remaining
information, will be considered as test (input) cases. The
choice of organizations that will comprise the CB cases and
the organizations that will comprise the test cases will be
pseudo-random.


The CBS's task, researched and developed by our team,
will be to find an appropriate value for the level of reuse
practice attribute of an input case; therefore, this attribute is
considered the solution data for a particular case in this
domain.


Matching Requirements of the Best Practices CBS

During testing of the CBS’s predictive power using new
subjects, the CBS search engine will need to use matching
methods [2, 38, 9, 23, 34]. Based on the methods used to
establish the similarity between certain new test cases and
current CB cases, the CBS will compare corresponding
features one at a time.


Each  test  case  will  contain  six  features.  The  first  two  of 
these  features  will  be  used  to  identify  the  particular 
organization  type  to  which  a  certain  organization  belongs. 

The remaining four features will denote the partially known
organization type software reuse practice level of the same
organization,  and  they  will  be  used  as  indexing  features. 
These  four  features  are  the  organization’s  reuse  practice 
levels  in  the  first,  second,  and  third  stages  of  reuse 
adoption,  utilization,  and  management,  and  the 
organization’s  practice  level  at  the  end  of  the  evaluation 
period.  Given  this  partially  known  organization  type’s 
reuse  practice  level,  i.e.,  given  a  test  case,  the  CBS’s  task 
will  be  to  predict  the  organization’s  practice  level  within 
the  class  of  the  given  organization  type. 

This  will  be  done  by  using  the  case  CB  in  order  to  find  the 
case  or  cases  that  are  most  similar  to  the  test  case. 
Similarity  will  be  determined  by  comparison  of 
corresponding  indexing  features.  For  example, 
corresponding  indexing  features  with  identical  numerical 
values will receive a similarity count of 1 while
corresponding  features  such  that  the  absolute  value  of  their 
difference  is  greater  than,  e.g.,  10  percent  will  receive  a 
similarity  count  of  0.  If  the  difference  is  less  than,  e.g.  10 
percent  then  the  similarity  count  will  be  a  numerical  value 
between  0  and  1.  The  sum  of  the  similarity  counts  for  each 
feature  will  constitute  the  degree  of  similarity  between  two 
cases;  therefore,  the  maximum  possible  match  value 
between  two  cases  will  be  equal  to  the  number  of  case 
features.  For  example,  the  previously  shown  CB  and  test 
cases  exhibit  a  certain  (e.g.  70)  percent  matching 
confidence since their degree of similarity is 70 percent.
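
The matching rule described above can be sketched as follows. This is only an illustration under stated assumptions: the text fixes the similarity count at 1 for identical values and at 0 beyond the threshold, but does not say how the value between 0 and 1 is obtained, so a linear falloff is assumed here. Measuring the difference relative to the larger of the two values, and reporting the matching confidence as the degree of similarity divided by the number of features, are likewise assumptions. All function names are hypothetical.

```python
def feature_similarity(a, b, threshold=0.10):
    """Similarity count for one pair of corresponding indexing features."""
    if a == b:
        return 1.0
    # Relative difference against the larger magnitude (assumption).
    rel_diff = abs(a - b) / max(abs(a), abs(b))
    if rel_diff >= threshold:
        return 0.0
    # Linear falloff from 1 (identical) to 0 (at the threshold): an assumption.
    return 1.0 - rel_diff / threshold


def degree_of_similarity(test_case, cb_case, threshold=0.10):
    """Sum of the similarity counts over corresponding features."""
    if len(test_case) != len(cb_case):
        raise ValueError("cases must have the same number of features")
    return sum(feature_similarity(a, b, threshold)
               for a, b in zip(test_case, cb_case))


def matching_confidence(test_case, cb_case, threshold=0.10):
    """Degree of similarity as a percentage of the maximum possible match."""
    return 100.0 * degree_of_similarity(test_case, cb_case, threshold) / len(test_case)


def most_similar(test_case, case_base):
    """Return the names of the CB cases with the highest degree of similarity."""
    scores = {name: degree_of_similarity(test_case, features)
              for name, features in case_base.items()}
    best = max(scores.values())
    return [name for name, score in scores.items() if score == best]
```

With four indexing features per case, the maximum degree of similarity returned by `degree_of_similarity` is 4, which matches the statement that the maximum possible match value equals the number of case features.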


4  CONCLUSION 

This  paper  focused  only  on  the  CBS  module.  The  paper 
presents  a  public  CBS  using  a  validated  Software  Reuse 
Reference  Model  (SRRM).  A  CBS-SRRM  allows  the 
software  engineer  to  improve  reuse  practice  by  being 
tutored  with  selected  course  material  based  on  the  student 
profile.  This  material  is  combined  with  actual  practice- 
based  knowledge  derived  from  different  positive  cases 
from  software  development  organizations’  reuse  practices. 
A  CBS-SRRM  provides  software  engineers  with  a  way  to 
be  tutored  using  positive  lessons  learned  by  other 
organizations. Our research focuses on achieving more
effective  means  for  software  development  organizations  to 
find  alternative  educational  (training)  solutions  to  problems 
in  successful  practice  of  reuse. 

Our  future  work  will  focus  on  presenting  and  integrating  a 
comprehensive  CBS  knowledge-based  tutoring  system  that 
supports  distance  learning  and  reuse  self-assessment  in 
combination  with  CBR  and  empirically  validated  SRRM. 
ABSTRACT

In  1994  Gibbs  claimed  that  “despite  50  years  of 
progress,  the  software  industry  remains  years— perhaps 
decades — short  of  the  mature  engineering  discipline 
needed to meet the demands of an information-age society.”
Many researchers have treated the problem using
different  approaches:  tools,  formal  methods,  prototyping, 
software  processes,  etc.  However,  this  assertion  remains 
true  today.  This  paper  considers  the  problem  from  the 
point of view of requirement engineering and risk assessment.
We present an improvement to the evolutionary
prototyping  process  model. 


1.  Introduction 

In complex software systems, reliability is an important
aspect of software quality that has been elusive in
practice. Since more and more human activities and systems
are dependent on software, achieving the appropriate
level of reliability in a consistent and economical way
is crucial. Software failures inconvenience people at best,
and in extreme cases can kill them.

Much reliability research has studied
the behavior of a system after it is operable. This
work has strong theoretical statistical foundations and
many of these models have been shown to be very accurate.
However, post-mortem analysis of the behavior of a
system gives insights too late to be useful for software
development.

This  paper  describes  a  way  to  improve  reliability  of 
systems  from  the  beginning  of  the  process.  Studies  have 
shown  that  early  parts  of  the  system  development  cycle 
such as requirements and design specifications are especially
prone to errors. Problems originating in the early
stages  often  have  a  lasting  influence  on  the  reliability, 
safety  and  cost  of  the  system.  In  early  stages  we  cannot 
directly  assess  reliability  of  products  that  do  not  exist  yet, 
but  we  can  assess  risks  that  could  contribute  in  the  future 
to  the  lack  of  reliability,  quality  and  usefulness  of  the 
system. 

Evolutionary  prototyping  offers  an  iterative  approach 
to  requirement  engineering  to  alleviate  the  problems  of 
uncertainty,  ambiguity  and  inconsistency  inherent  in  the 
process.  Moreover,  prototyping  can  improve  the  capture 
of  change  in  requirements  and  assumptions  during  the 
development process. This effect is particularly noticeable
in  projects  involving  multiple  stakeholders  with  different 
points  of  view. 

Computer Aided Prototyping System (CAPS) [1] is a
CASE tool that provides a collection of techniques and
languages for computer-aided prototyping, including
logical assessment of the consistency and clarity of
requirements and specifications. CAPS methods involve the
use of real-time constraints and abstract modeling to
describe the requirements in a clear, precise, consistent and
executable format. Prototypes can be applied to
demonstrate system scenarios to the affected parties as a way to:
a) collect criticisms and feedback that are sources for new
requirements; b) detect deviations from users'
expectations early; c) trace the evolution of the requirements;
and d) improve the communication and integration of the
users and the development personnel.

2. CAPS (Computer Aided Prototyping System)

Real-time systems present special difficulties in terms
of requirement engineering. Some requirements are
difficult for the user to provide and difficult for the analysts to
determine. The best way to discover these hidden
requirements is via prototyping. CAPS is a tool specially
suited for this task. It has a graphical, easy-to-understand
interface that maps to a specification language, which in
turn generates Ada code. The main components of
CAPS are:

(a) The prototype system description language (PSDL).
PSDL is based on data flow under real-time constraints.
It uses an enhanced data flow diagram that
includes non-procedural control and timing constraints.

(b) User interface based on a graphic editor with a palette
of objects that include operators, inputs, outputs,
data flows and operator loops. A search engine
helps the designer to find reusable components.

(c)  The  software  database  system  provides  a  repository 
for  reusable  PSDL  components. 

(d) The execution support system consists of a translator,
scheduling mechanisms, execution monitors,
and a debugger.

The prototyping process consists of prototype construction
and modification (evolution) based on evolving
requirements and code generation. Both construction and
modification are exploratory activities with a common
target: to satisfy multiple users with different and often
conflicting points of view. Requirement engineering is a
consensus-driven activity in which mechanisms for conflict
resolution and traceability of requirement evolution
represent critical success factors.

3. REMAP (Representation and Maintenance
of Process Knowledge)

The REMAP model [2] represents the conflict resolution
of requirements in a multiple stakeholder environment.
It is an improvement of the IBIS model introduced
by [3]. Figure 1 shows the conceptual model of REMAP.

Requirements  are  the  main  input  and  output  of  each 
demonstration  of  the  prototype.  Initially,  a  small  set  of 
requirements is collected. The requirements generate controversy
between different stakeholders. The argumentation
process is covered by the extension to the IBIS
model.  The  primitives  of  IBIS  are  issues,  positions  and 
arguments.  Issues  are  questions  or  concerns.  Positions 
represent  the  points  of  view  of  different  stakeholders. 
Arguments  can  support  or  object  to  positions,  and  are 
based  on  assumptions.  Design  decisions  resolve  issues 
introducing  constraints,  which  define  design  artifacts. 
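
The relationships among these primitives can be pictured with a small data-structure sketch. It is not the REMAP implementation; the class and field names below are hypothetical and only mirror the vocabulary of the paragraph above (issues, positions, arguments, assumptions, decisions, constraints).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Argument:
    text: str
    supports: bool                      # True: supports the position; False: objects to it
    assumptions: List[str] = field(default_factory=list)


@dataclass
class Position:
    stakeholder: str                    # whose point of view this is
    text: str
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Decision:
    chosen: Position
    constraints: List[str] = field(default_factory=list)  # constraints on design artifacts


@dataclass
class Issue:
    question: str
    positions: List[Position] = field(default_factory=list)
    decision: Optional[Decision] = None  # set once a design decision resolves the issue


def open_issues(issues: List[Issue]) -> List[Issue]:
    """Issues that no design decision has resolved yet."""
    return [issue for issue in issues if issue.decision is None]
```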


The requirement engineering process transforms initial
requirements that usually are informal and imprecise
into more technical and precise specifications. Specifications
are required for practical development purposes and
can be understood by engineers. However, they are not
well understood by users. So, it is necessary to provide a
full spectrum of descriptions. For that reason, the primitives
of REMAP have been integrated into the graph
model [4] in successive efforts [5] and [6].

4.  The  Graph  Model 

The  graph  model  is  a  data  graph  model  for  evolution 
that  records  dependencies  and  supports  automatic  project 
planning,  scheduling,  and  configuration  management.  The 
evolution  process  is  represented  by  a  graph  that  at  any 
given  moment  models  the  current  and  the  past  state  of  the 
software  system  as  well  as  planned  future  states. 

The  model  views  a  software  evolution  process  as  a 
partially  ordered  set  of  steps.  Steps  represent  activities 
required  to  produce  the  system.  A  step  has  states  that 
reflect  the  dynamic  progression  of  the  activity  from  the 
moment  that  it  is  proposed  to  the  moment  it  is  completed 
or  abandoned. 

The  graph  model  has  experienced  its  own  evolution 
process.  Luqi  [1]  introduced  a  primitive  version  of  the 
model.  Mostov  and  Luqi  [7,  4]  refined  and  elaborated  the 
model.  In  [4],  the  notion  of  hypergraph  was  introduced  to 
realize automated software evolution in multidimensional
phases. Further refinements, including scheduling and
team coordination, were introduced by [8]. Conflict resolution
of requirements and criticisms was introduced by
Ramesh [2] and Ibrahim [5]. Luqi [9] extended the graph
model to a hierarchical hypergraph that improved the
traceability of the dependencies and introduced the concept
of hyper-requirements. Finally, Ham extended the
model to a relational hypergraph model [6].

5. Risk assessment driven software evolution

Experience suggests that building and integrating
software by mechanically processable formal models
leads to cheaper and more reliable products, sooner.
Software development processes such as the hypergraph
model for software evolution, or the spiral model [10],
have improved the state of the art. However, they have a
common weakness: risk assessment.

In the software evolution domain, risk assessment has
not been addressed as part of the model. In its various
enhancements and extensions the graph model did not
include risk assessment steps; hence risk management
remains a human-dependent activity that requires expertise.

In the evaluation of the spiral model, one of the difficulties
mentioned by Boehm was: "Relying on risk-assessment
expertise. The spiral model places a great
deal of reliance on the ability of software developers to
identify and manage sources of project risk." "... Another
concern is that a risk-driven specification will also be
people-dependent." [10].

Many researchers have addressed the problem of risk
assessment following the perspective of the traditional
disciplines. The tools for risk assessment are guidelines
for practices, checklists, taxonomies of risk factors and
a few metrics. All these methods work fine IF carried out
by a human educated in risk assessment AND with
enough experience. Unfortunately, such resources are
really scarce.

From the point of view of software engineering, it is
necessary to create a method to support the decision-making
process during the early stages of the life cycle,
when changes can be made with less impact on the budget
and schedule. In our vision, software risk management
deals with how to manage complexity and how to
assign resources. We propose to separate risk assessment
into three classes: resource risk, process risk and product
risk.

Resource risk is the amount of project risk created by
threats imposed by available resources. It is affected by
organizational, operational, managerial and contractual
parameters such as outsourcing, personnel, time and
budget. The literature is abundant in this area [11, 12].
Various approaches use subjective techniques such as
guidelines and checklists [13], [11], which require the
opinion of an expert even when they could be supported
by metrics. [12] has introduced a more rigorous method.
In this approach, the risk is viewed as a three-dimensional
entity that depends on schedule risk, cost risk
and technical risk.

The process risk is the amount of the project risk
caused by management work procedures such as planning,
quality assurance, and configuration management. It
is also caused by technical work procedures related to the
software processes such as requirements, analysis, design,
code generation, testing, etc. The more complex a process
is, the more difficult it is to manage. More education,
training, standards, reviews, and communication are
required. Consequently, complexity grows. Software process
complexity has been partially addressed by research
in terms of subjective assessments about maturity level
and expertise [13, 11, 14]. However, we seek a more precise
and objective method. Several approaches to study
process complexity in a static way have been introduced
in the field of management. Simulation can be used to
measure the complexity of the dynamics of the processes.

Finally, product risk is related to the final characteristics
of the product, its conformance with specifications
and requirements, its reliability and customer satisfaction.

We think that there exists a dependency between
these classes of risk. The success of the project depends
on its own characteristics and on the success of the product
and the process. The success of the process depends
on itself as well as on the success of the project and the
product. And the success of the product depends on itself
and on the success of the project and the process. The
dependencies among the three areas constitute an equivalence
relation because the symmetric, transitive and reflexive
properties apply. In our view, this reflects the fact
that resources, process and product are different facets of
the same entity: the project.

Dealing  with  threats,  the  decision-maker  can  apply 
the  following  strategies: 

• Risk absorption, which is to assume the consequences
of the risk as a constraint.

• Risk avoidance, which eliminates the possibility of the
risk by following workaround solutions that avoid the
threat.

• Risk prevention, which is the typical situation. Protection,
mitigation and anticipation are the key factors
to reduce risk.

• Risk transfer, which implies the shift of the consequences
of the risk to another organization.

• Risk contingency, which implies the use of reserve
resources to mitigate an actual threat according to a
previously established contingency plan.

6. The proposed model for risk assessment

Transforming the unstructured problem of risk assessment
into a structured one leads to an objective method that can be
translated into an algorithm. In order to structure the problem,
we decompose risk assessment of an engineering project
into two different visions. First, a micro-vision is required
for threat resolution. This micro-vision risk assessment
relates to the identification of the threats, the decision-making
process to address the problem, and the formalization
of the solution in a plan.

The micro-vision is necessary but not sufficient because
it is impossible to manage a project without a
global scenario. Hence, a macro-vision approach is also
required. The macro-vision approach relates to the integration
of the evaluation made for each of the threats. The
macro-vision risk assessment of the project includes three
risk components: process, product and resources.

6.1.  Micro-vision 

The decision-maker is positioned at the root of a decision
tree, where each branch represents a course of action
that implies costs and probabilities of success. When
a threat is identified, two possible choices are available:
to avoid the threat or to deal with it. Avoiding a threat
usually has some associated costs. Typically,
avoiding a threat implies finding a workaround that can
have effects on schedule, budget or even on functionality.

If  the  decision-maker  opts  to  deal  with  the  threat, 
then three possible courses of action are available: to prevent,
to wait, or to transfer the threat. Prevention and
transfer  could  have  associated  costs.  The  waiting  strategy 
postpones  the  use  of  resources  in  the  hope  that  the  threat 
will  not  appear,  trying  to  trade  information  for  time. 

Even when prevention is applied, there is no absolute
guarantee  that  the  threat  will  not  appear.  In  these  cases 
the  decision-maker  can  apply  a  contingency  plan  that 
introduces  new  costs.  Again  the  contingency  plan  cannot 
guarantee  absolute  effectiveness. 

If we know or can estimate the probability of each
branch representing a state of nature, it is possible to calculate
the expected outcome for each one as the weighted
sum of outcomes. So, we can arrive at the root with a
value for the expected cost. The path that produces an
optimal expected solution contains the recommended
course of action.
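
The roll-back calculation described above can be sketched as follows. This is a minimal illustration, not the authors' tool: the `Node` structure, the use of cost minimization as the optimality criterion, and every number in the example tree are assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    kind: str = "outcome"          # "decision", "chance", or "outcome"
    cost: float = 0.0              # cost paid when this node is reached
    # (probability, child) pairs; the probability is ignored under decision nodes
    children: List[Tuple[float, "Node"]] = field(default_factory=list)


def expected_cost(node: Node) -> Tuple[float, List[str]]:
    """Roll the tree back to `node`; return (expected cost, recommended path)."""
    if not node.children:
        return node.cost, [node.name]
    if node.kind == "chance":
        # Weighted sum over the states of nature.
        weighted = sum(p * expected_cost(child)[0] for p, child in node.children)
        return node.cost + weighted, [node.name]
    # Decision node: follow the branch with the lowest expected cost.
    best_cost, best_path = min(
        (expected_cost(child) for _, child in node.children), key=lambda r: r[0]
    )
    return node.cost + best_cost, [node.name] + best_path


# Hypothetical tree following the choices named in the text.
prevent = Node("prevent", "chance", cost=10, children=[
    (0.75, Node("threat does not appear")),
    (0.25, Node("contingency plan", cost=40)),
])
wait = Node("wait", "chance", children=[
    (0.5, Node("threat does not appear")),
    (0.5, Node("threat appears", cost=100)),
])
transfer = Node("transfer", cost=30)
deal = Node("deal with threat", "decision",
            children=[(1.0, prevent), (1.0, wait), (1.0, transfer)])
avoid = Node("avoid threat", cost=40)
root = Node("threat identified", "decision", children=[(1.0, avoid), (1.0, deal)])

cost, path = expected_cost(root)
print(cost, path)
```

With these illustrative numbers, prevention has the lowest expected cost (10 + 0.25 * 40 = 20), so the recommended path runs from the root through "deal with threat" to "prevent".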

To solve the uncertainties, subjective estimation of
the probabilities of occurrence of the different states of
nature can be applied. This approach is easy to implement
but requires a great deal of experience to judge accurately
the success probability of each alternative. Group consensus
techniques (like the Delphi method) are usually very
helpful in such situations.

Decision trees based on the expected monetary
value (EMV) could lead to bad decisions because in the
most common case the decision-maker is confronted with
a multiattribute problem. Moreover, different people have
different attitudes toward risk. This issue can be addressed by applying
utility theory. The decision-maker must provide his estimation
of return for each attribute related to the decision, as
a vector R = (R1, R2, ..., Rn). The decision-maker must
also introduce his preferences as a weight vector W =
(W1, W2, ..., Wn). The outcomes of each attribute are
given by Ai, such that:

Ai = Wi * Ri, where W1 + W2 + ... + Wn = 1

The  outcome  for  each  alternative  is  then  calculated 
as a function of the sum of the attributes (A1, A2, ..., An)
converted  to  a  value  between  0  and  1,  where  1  is  given  to 
the  best  outcome  and  0  to  the  worst. 
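
A small sketch of this weighting scheme follows. The text does not specify the function that converts the sum of the attributes to a value between 0 and 1, so min-max scaling across the alternatives is assumed here; the function names and the sample figures are hypothetical.

```python
def attribute_outcomes(returns, weights):
    """Ai = Wi * Ri for each attribute; the weights must sum to 1."""
    if len(returns) != len(weights):
        raise ValueError("R and W must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [w * r for w, r in zip(weights, returns)]


def normalized_outcomes(alternatives, weights):
    """Map each alternative to a score in [0, 1]: 1 for the best, 0 for the worst.

    `alternatives` maps an alternative name to its return vector R.
    Min-max scaling is an assumption; the paper only fixes the end points.
    """
    totals = {name: sum(attribute_outcomes(r, weights))
              for name, r in alternatives.items()}
    worst, best = min(totals.values()), max(totals.values())
    if best == worst:
        return {name: 1.0 for name in totals}
    return {name: (t - worst) / (best - worst) for name, t in totals.items()}


# Illustrative preferences: the first attribute counts twice as much as the others.
weights = [0.5, 0.25, 0.25]
alternatives = {
    "prevent": [8, 4, 4],
    "wait": [2, 4, 8],
    "transfer": [6, 4, 4],
}
scores = normalized_outcomes(alternatives, weights)
```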

6.2.  Macro-vision 

As  we  stated  previously,  the  macro-vision  approach 
integrates  the  assessments  done  for  each  of  the  identified 
threats.  Moreover,  the  macro-vision  risk  can  be  used  to 
find  threats  in  an  automated  way.  The  risk  assessment  for 
the  project  is  done  by  the  integration  of  three  risk  factors 
(process,  resources  and  product),  plus  two  customization 
factors  (decision-maker's  perceptions  of  the  environment 
and  decision-maker’s  preferences). 

The process introduces risk as a consequence of its
requirements and characteristics: complexity, technology
required, budget required, schedule required, and personnel
skills required. The process provides the description
of its environment and the theoretical requirements to
execute it.

The resources represent the actual allowances in personnel,
tools, budget and schedule. The resources impose
constraints  that  may  not  match  the  process  requirements. 
These  mismatches  are  a  source  for  threats  that  can  be 
identified  automatically. 
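
As an illustration of how such mismatches might be detected automatically, the sketch below compares what the process requires with what the resources allow, factor by factor. The factor names, units and the simple shortfall rule are hypothetical; the paper does not define a concrete comparison.

```python
def find_mismatches(required, available):
    """Return {factor: shortfall} for every factor where resources fall short.

    `required` describes the process requirements and `available` the actual
    allowances; both map a factor name to a number in the same unit.
    A factor missing from `available` is treated as zero (assumption).
    """
    threats = {}
    for factor, need in required.items():
        have = available.get(factor, 0)
        if have < need:
            threats[factor] = need - have
    return threats


# Hypothetical figures, for illustration only.
process_requirements = {"budget": 100, "schedule_weeks": 20, "personnel": 5}
resource_allowances = {"budget": 80, "schedule_weeks": 24, "personnel": 5}
threats = find_mismatches(process_requirements, resource_allowances)
```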

The product introduces its own risk in terms of quantitative
and qualitative attributes. We identified two basic
product-risk factors: requirement conflicts and requirement
complexity. The second one is a consequence of the
functional complexity of the requirements and the quality
target defined in terms of reliability, maintainability and
usefulness.

The risk assessment of the project can be structured
as the evaluation of the complexities and of the degree of
mismatch between the product and process characteristics and
the resource constraints. The process of collecting risk
metrics can be automated, at least for the principal factors.
Hence, project risk can be assessed using an automated
tool.

7.  Metrics 

Metrics are a key factor in the identification of
threats. Without metrics it is not possible to provide early
alerts of risks. In this section we describe a set of metrics
that support our risk identification strategy. All the metrics
presented here are well formed, in the sense that they
present the following strengths:

•  Robust  in  terms  of  the  verification  of  their  outputs. 

• Repeatable. Different observers would arrive at the
same measurement regardless of the number of repetitions.

• Simple. We use the least number of parameters sufficient
to obtain an accurate measurement.

• Easy to calculate. They do not require complex algorithms
or processes.

• Automatically collected. There is no need for human
intervention.

7.1  Metrics  for  Requirements 

We define birth rate (BR) as the percentage of new
requirements incorporated in each cycle of the evolution
process. This metric shows the explosion of new requirements
as a percentage.

BR  %  =  (NR  /  TR)  *  100,  where 

NR  =  number  of  new  requirements, 

TR  =  total  number  of  requirements 
TR = PR + NR, where PR denotes the number of requirements
in the previous version.

We define death rate (DR) as the percentage of requirements
that are dropped by the customer in each cycle
of the evolution process.

DR  %  =  (DelR  /  TR)  *  100,  where 

DelR  =  number  of  requirements  deleted, 

TR  =  total  number  of  requirements  (before  deletion)  =  PR 
+  NR. 

We define change-rate (CR) as the percentage of requirements
changed from the previous version.

CR  (%)  =  (ModR  /  TR)  *  100 

where  ModR  =  number  of  requirements  changed. 
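
The three definitions above translate directly into code. The sketch below is only an illustration; the function and parameter names are hypothetical, and the sample counts are invented.

```python
def requirement_rates(previous, new, deleted, modified):
    """Compute BR, DR and CR (in percent) for one cycle of the evolution process.

    previous -- PR, number of requirements in the previous version
    new      -- NR, number of new requirements
    deleted  -- DelR, number of requirements deleted
    modified -- ModR, number of requirements changed
    """
    total = previous + new          # TR = PR + NR (before deletion)
    if total == 0:
        raise ValueError("no requirements to measure")
    return {
        "BR": 100 * new / total,
        "DR": 100 * deleted / total,
        "CR": 100 * modified / total,
    }


# Illustrative cycle: 80 previous requirements, 20 new, 5 deleted, 10 changed.
rates = requirement_rates(previous=80, new=20, deleted=5, modified=10)
```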

From the point of view of the metrics, a change to a
requirement can be viewed as a death of the old version
and a birth of the new one. This simplification does not
imply that we lose the history of the evolution. The traceability
of the evolution remains in the hypergraph model.



Figure  2:  Evolution  of  requirements  in  a  project 

The simplification just described enables us to compare
birth rate and death rate in a two-dimensional plot
that shows four regions: stability region, growing region,
volatility region and shrinking region (fig. 2). The graph
is double logarithmic, so the borders of the four regions
are at the 10% value. Each of these regions has different
risk connotations.
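
A classification into these four regions could be sketched as below. The assignment of regions to quadrants is an assumption inferred from the region names (the figure itself is not reproduced here): low birth and death rates are taken as stability, a high birth rate alone as growing, a high death rate alone as shrinking, and both high as volatility. Treating a rate of exactly 10% as high is also an assumption.

```python
def classify_region(birth_rate, death_rate, border=10.0):
    """Place one cycle in a region of the birth-rate / death-rate plot.

    Rates are percentages; `border` is the 10% boundary between regions.
    The quadrant-to-region mapping is inferred from the region names.
    """
    high_birth = birth_rate >= border
    high_death = death_rate >= border
    if high_birth and high_death:
        return "volatility"
    if high_birth:
        return "growing"
    if high_death:
        return "shrinking"
    return "stability"


# Illustrative history of (BR %, DR %) pairs over successive cycles.
history = [(40.0, 2.0), (25.0, 15.0), (5.0, 3.0)]
regions = [classify_region(br, dr) for br, dr in history]
```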

The arrow shows the normal evolution of a project as
time goes by. During early stages, it is normal for
projects to be in the growing region. However, if the project
continues in this region after many cycles, or returns
to this region after visiting other regions, something
is going wrong. In the first case, this is an indicator
that the requirement engineering is not efficient; hence
some corrective action should be applied. The second
case shows evidence of late discovery of some cluster of
hidden requirements.

After some cycles, the project should be in the volatile
region. If the project does not evolve into the stability
region, then there is evidence that the requirements engineering
activity is not efficient and some corrective
action is mandatory. It is important to analyze the evolution
of the stakeholders' issues and criticisms. It could
also be the case that stakeholders have changed their minds.

If  the  project  evolves  to  the  shrinking  region,  and  the 
requirements  engineering  is  working  properly,  there  is 
evidence  that  the  customers  are  cutting  down  the  project. 
This  can  be  an  indicator  of  a  severe  cut  in  the  budget. 

Finally,  any  involution  to  a  previous  region  should  be 
considered  as  evidence  of  threats.  In  such  cases  a  detailed 
analysis  is  required  to  assess  the  causes  of  the  anomaly. 
This set of metrics can be collected automatically
from the hypergraph and can give early alerts of the
threats.

7.2  Metrics  for  Complexity 

Complexity has a direct impact on quality because the
likelihood that a component fails is directly related to its
complexity. The quality of the product can only be determined
at the end of the process. Hence, it is important
to measure the complexity as a predictor.

Real-time systems present special difficulties in terms
of requirement engineering. Some requirements are
difficult for the user to provide and difficult for the analysts to
determine. The best way to discover these hidden
requirements is via prototyping. CAPS is a CASE tool
specially suited for this task.

The prototyping process consists of prototype construction
and modification (evolution) based on evolving
requirements and code generation. Both construction and
modification are exploratory activities with a common
target: to satisfy multiple users with different and often
conflicting points of view. Requirement engineering is a
consensus-driven activity in which mechanisms for conflict
resolution and traceability of requirement evolution
represent critical success factors.

Specifications  written  in  PSDL,  the  prototyping  lan¬ 
guage  used  in  CAPS,  are  suitable  for  being  analyzed  to 
compute  their  complexity.  In  PSDL  code  we  observe  the 
following  components:  types,  operators,  data  streams  and 
constraints.  Types  are  declarations  of  abstract  data  types 
required for the system. Operators and data streams are
the components of a dataflow graph. Finally, constraints
represent  guard  conditions  and  real-time  constraints  that 
the  system  must  support. 

We define two complexity metrics for PSDL: the Fine
Granularity Complexity metric (FGC) and the Large
Granularity Complexity metric (LGC). We compute
different metrics because we want to detect two classes
of threats. First, we need to be aware of operators that are
too complex. High complexity in one operator could be
caused by poor design and can possibly be solved by
further decomposition. Second, we require a metric to
compute the total complexity of the system.

FGC  expresses  the  complexity  of  each  operator  in 
the  system  and  is  a  function  of  the  fan-in  and  fan-out  data 
streams  related  to  the  operator. 

FGC  =  fan-in  +  fan-out 

LGC  expresses  the  complexity  of  the  system  as  a 
function  of  the  number  of  operators,  data  streams,  and 
types. 

LGC  =  O  +  D  +  T 
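
As an illustration of the two definitions above, the following Python sketch computes FGC and LGC for a toy dataflow graph. The `Stream` and `Prototype` classes and the three-operator example are hypothetical names introduced only for this sketch; they are not part of PSDL or CAPS, and each stream is assumed to connect exactly one producer to one consumer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    """A data stream from one operator to another (hypothetical model)."""
    name: str
    producer: str
    consumer: str


@dataclass
class Prototype:
    """Minimal stand-in for a PSDL specification: operators, streams, types."""
    operators: set
    streams: list
    types: set


def fgc(p: Prototype, operator: str) -> int:
    """FGC = fan-in + fan-out of the data streams attached to one operator."""
    fan_in = sum(1 for s in p.streams if s.consumer == operator)
    fan_out = sum(1 for s in p.streams if s.producer == operator)
    return fan_in + fan_out


def lgc(p: Prototype) -> int:
    """LGC = O + D + T: counts of operators, data streams, and types."""
    return len(p.operators) + len(p.streams) + len(p.types)


# Toy example: sensor -> filter -> display, with one abstract data type.
demo = Prototype(
    operators={"sensor", "filter", "display"},
    streams=[
        Stream("raw", "sensor", "filter"),
        Stream("clean", "filter", "display"),
    ],
    types={"reading"},
)
```

In this toy graph `fgc(demo, "filter")` counts one incoming and one outgoing stream, while `lgc(demo)` adds three operators, two streams, and one type.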


Figure  3:  Correlation  between  PSDL  and  LGC 


We examined the correlation between LGC and the size
of the specifications and the code. We observed a very
strong correlation between PSDL lines of code and LGC
(R = 0.996) (fig. 3). We also observed a strong correlation
between the non-comment Ada lines of code of the projects
and their complexity measured using LGC
(R = 0.898) (fig. 4). Our complexity metric
correlates better with PSDL than with Ada. The
reason for this difference is that CAPS automatically
generates PSDL. On the other hand, even if CAPS
generates part of the Ada code, the designer can add to and
modify the generated code, introducing more variability. The
following graph shows the correlation observed for the
same set of projects.


Figure  4:  Correlation  between  NCLOC  (Ada)  and  LGC 


A caveat of this study is that our sample is small.
It includes all the information we have available at the
moment. However, the study suggests that it may be possible
to estimate code size from requirement complexity with
useful levels of accuracy.
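
The kind of analysis described above can be sketched in a few lines of Python: a Pearson correlation coefficient plus a least-squares line that predicts size from LGC. The data points below are invented for illustration only and are not the measurements behind figures 3 and 4; the function names are likewise assumptions of this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Illustrative (made-up) projects: LGC value and PSDL lines of code.
lgc_values = [10, 20, 30, 40]
psdl_loc = [52, 98, 151, 203]

r = pearson_r(lgc_values, psdl_loc)
slope, intercept = fit_line(lgc_values, psdl_loc)


def estimate_size(lgc_value):
    """Predict size for a new project from its LGC using the fitted line."""
    return slope * lgc_value + intercept
```

With a sample as small as the one the study reports, such a fitted line should be read as a rough indication rather than a calibrated estimator.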

8.  Integration  with  the  graph  model 

The graph model has the advantage of being easily
expandable. The model is based on a hypergraph G = (N, E,
I, O), where N is a set of nodes that represent the software
components and related documents; E is a set of edges
that represent the steps or tasks required by the process; I
and O are functions that permit navigation forward
and backward in the graph. Risk assessment activities can
easily be incorporated into the model by extending
the class of edges. Figure 5 represents the evolutionary
prototyping software process. Figure 6 shows
the proposed software process improvement. From the
specifications we can derive the complexity of the
product. This information is used together with personnel and
organizational information, and with metrics of
requirements collected from the baselines, to produce the risk
assessment. The risk assessment step integrates these
measures with issues created by the application of the
REMAP model in the issue analysis steps. The automated
risk assessment provides the decision-maker with
objective and reliable information.
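
A minimal Python sketch of the hypergraph G = (N, E, I, O) follows. It assumes, as one plausible reading of the definition above, that I maps each edge (step) to its input nodes and O maps it to its output nodes; the class name, method names, and the example node names are hypothetical, not taken from CAPS.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """G = (N, E, I, O): nodes are components/documents, edges are steps."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)
    inputs: dict = field(default_factory=dict)   # I: edge -> input nodes
    outputs: dict = field(default_factory=dict)  # O: edge -> output nodes

    def add_step(self, edge, inputs, outputs):
        """Register a step together with the nodes it consumes and produces."""
        self.edges.add(edge)
        self.nodes |= set(inputs) | set(outputs)
        self.inputs[edge] = set(inputs)
        self.outputs[edge] = set(outputs)

    def steps_consuming(self, node):
        """Forward navigation: steps that take this node as input."""
        return {e for e in self.edges if node in self.inputs[e]}

    def steps_producing(self, node):
        """Backward navigation: steps that produced this node."""
        return {e for e in self.edges if node in self.outputs[e]}


# Extending the class of edges with a risk assessment step (assumed names).
g = Hypergraph()
g.add_step("issue_analysis", {"criticisms"}, {"issues"})
g.add_step("risk_assessment", {"issues", "specification"}, {"risk_report"})
```

Adding the risk assessment step required no change to the structure itself, which is the sense in which the model is easily expandable.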


9.  Conclusion 

We  introduced  a  framework  and  metrics  able  to 
structure  the  risk  assessment  problem  and  to  solve  it  by 
automated  tools.  Further  experiments  should  be  con¬ 
ducted  to  validate  our  preliminary  observations  on  com¬ 
plexity  and  size. 

We found a method to solve the problem of human
dependency in risk assessment. This method was
designed for the graph model; however, it can be customized
to any evolutionary prototyping software process.


Figure 6: The proposed process

Abstract

This paper discusses the problems of software engineering as the weakest link in the development
of systems capable of achieving information superiority. Fast changes in technology introduce
additional difficulties in terms of strategic planning, organizational structure, and engineering of
software development projects. In such a complex environment, a new way of thinking is required.

We analyze the introduction of complex adaptive systems as an alternative for planning and
change. The strategy of competition on the edge of chaos is analyzed, showing the risks and the
skills required to navigate on the edge. We discuss the feasibility of using this theory in software
engineering as an alternative to bureaucratic software development processes. We also present
some recommendations that could help to acquire competitive advantage in software
development, and hence achieve information superiority.

1.  Introduction 

As software systems increased in complexity, software development evolved from a primitive art
into software engineering. Methodologies and software tools were developed to help
development processes. Most of the present tendencies (DOD-STD-2167A, ISO-9001, SEI/CMM) try to
standardize processes, emphasizing planning and structure (Humphrey, 1990). Some authors
criticize those approaches, stating that they underestimate the dynamics of software development
(Bach, 1994), (Abdel-Hamid, 1997). Others object that activities such as research and
development are not addressed by TQM principles (Dooley et al., 1994).

In 1994 Gibbs claimed that "despite 50 years of progress, the software industry remains years,
perhaps decades, short of the mature engineering discipline needed to meet the demands of an
information-age society." Many researchers have treated the problem using different approaches:
tools, formal methods, prototyping, software processes, etc. However, this assertion remains true
today.

The typical software engineering process is a succession of decision problems trying to transform
a set of fuzzy expectations into requirements, specifications, designs, and finally code and
documentation. The traditional waterfall software process failed to accomplish its purpose because it
applied a method valid for well-defined and quasi-static scenarios. This hypothesis is far from
reality. Today, modern software processes (Boehm, 1988), (Luqi, 1989) are based on evolution
and prototyping. These approaches recognize the fact that software development presents an
ill-defined decision problem, but they fail to assess risk automatically.

In our view, software development projects present special characteristics that must be
addressed in order to achieve an improvement in the state of the art. These particularities affect the
strategic planning, the organizational structure, and the engineering applied to software. In these
three areas chaos theory can provide clues for possible solutions.

2.  The  strategic  planning  issue 

Traditional approaches to strategic planning emphasize picking a unique strategy according to the
competitive advantages of each organization. Porter's five-force approach (Porter, 1980)
assumes that there exists some degree of accuracy in the prediction of which industries and which
strategic positions are viable and for how long.

In  a  high-velocity  scenario  the  assumption  of  a  stable  environment  is  too  restrictive.  Customers, 
providers,  competitors,  and  potential  competitors,  as  well  as  substitute  products  are  evolving 
faster  than  expected.  The  introduction  of  new  information  technology  tools,  the  Internet  and  the 
globalization  of  the  markets  are  contributing  to  this  phenomenon,  and  nothing  seems  to  reverse 
the  process.  The  failure  of  long-term  strategic  planning  is  not  a  failure  of  management;  it  is  the 
normal  outcome  in  a  complex  and  unpredictable  environment.  A  growing  number  of  consultants 
and  academics  (Santosus,  1998),  (Brown  &  Eisenhardt,  1999)  are  looking  at  complexity  theory, 
to  help  decision-makers  improve  the  way  they  lead  organizations. 

How useful could a map of a territory that is constantly changing its topography be? In
fast-changing environments, survival requires a refined ability to sense the external variables.
Traditional approaches rely on strategic planning and vision. However, in unstable environments
planning would not be effective because it is impossible to predict the scenario's evolution in terms of
markets, technologies, customers' needs, etc. Organizations relying only on one vision supported
by tight planning risk paying little attention to the future. Consequently, their sensing organs are
blind to the future. Still, a certain amount of inertia and commitment to the plans is required
to prevent erratic changes caused by reacting to diverse variables.

If the time window of the opportunities is shrinking, a different form of thinking is required. The
present technological situation can be described as a fast succession of short-term niches. The
ability to change is the key to success for surviving in such a variable environment. In a systemic
approach, the General Systems Theory establishes that organizations are systems whose viability
depends on some basic behaviors (von Bertalanffy, 1976):

(a) Ability to sense changes in the environment. This is the most primitive form of intelligence; if
it is not present, the probability of survival is minimal.

(b)  Ability  to  adapt  to  a  new  environment  modifying  the  internal  structure  and  behavior.  The  sys¬ 
tem  tries  to  auto-regulate  to  survive  the  crisis  in  hostile  scenarios,  or  take  advantage  of  the 
opportunities  in  favorable  ones. 

(c) Ability to learn from the past, anticipating the auto-regulation behaviors and structure before
the environment changes. This ability requires intelligence able to infer conclusions from the
past according to the context of the variables sensed in the present.

(d) Ability to introduce changes in the environment, making it more favorable to the system's
needs. In this case, the system has developed the technology (know-how and tools) to exert
power over the environment.

Any mechanical or computing system has some or all of these abilities. We find these same
abilities in all forms of life: the more developed the system is, the more of the above characteristics
it has. Darwin's Evolution Theory validates this line of reasoning. Natural selection, acting on
inherited genetic variation through successive generations over time, is the form of evolution.
Mutation is the tool used by biological systems to probe the environment, presenting many
alternatives, most resulting in failure but a few very successful. This is an inefficient
but very effective way of improvement.

Experiments can provide a certain amount of knowledge about the future. In some sense, probes
are mutations on a small scale that can cause only small losses. The results give insights to discover
options to compete in the future and stimulate creative thinking. The research investment
pays dividends when a new way of competing is discovered, altering the rules of the status quo.

When the changes in the environment occur too fast, sensing the variables becomes more difficult.
It is possible that a specialized organ is not able to react in time to record the metric and
transmit the alert. In this case, the system starts to lose information, threatening its own viability.
When the changes in the environment are too drastic, even if the sensor organs detect the change,
the inference organs may not be able to determine an effective course of action because they
have no previous experience, or because the decision-making process requires more time. This
situation also threatens the viability of the system in the long run. The effects of drastic variations
and high rates of change on systems can be visualized with simple experiments: a) increasing the
speed of transmission in a communication channel beyond some limit will cause the loss of part
or all of the message; b) modifying the pH of the soil beyond a certain limit can cause the death of
a plant.

The same syndrome can be recognized in any type of organization. We propose to employ a new
strategy. "Competing on the Edge" is a new theory that defines strategy as the creation of a relentless
flow of competitive advantages that, taken together, form a semi-coherent strategic direction
(Brown & Eisenhardt, 1999). The key driver for superior performance is the ability to change,
reinventing the organization constantly over time. This factor of success can be applied to
software engineering as well as to other decision problems with similar characteristics.

If the environment is moving, as in surfing, the best way to remain in equilibrium is by being in
movement. Successful corporations such as Intel or Microsoft are in perpetual movement,
launching new products with a certain rhythm. Intel is faithful to its founder's (Moore) law: the
power of microprocessors doubles every eighteen months. Microsoft has a proportional pace
in the software sector.

3.  The  organizational  issue

The second unresolved issue is organizational. We think that many of the problems in current
software projects have organizational roots. This opinion is also supported by (van Genuchten,
1991)1 and (Capers Jones, 1994)2.

Perrow (Burton et al., 1998) introduced a two-dimensional classification of technology
(Fig. 1). The first dimension is the analyzability of the problem, varying from well-defined to
ill-defined. The second dimension is the task variability, which means the number of expected
exceptions in the tasks.

In our view, a third dimension is required to model the dynamics of the problem. In general,
any technological scenario will change its analyzability and its variability with time. This is
the case for software engineering developments. During the initial stages the problem is
ill-defined and many exceptions occur. After several evolution cycles, usually comprising
several prototypes, the requirements become clear and the problem drifts gradually into the
engineering quadrant. In figure 1, the gray oval represents the projection of the software problems
onto a two-dimensional space.

This kind of scenario requires highly skilled personnel, low formalization and centralization, high
information-processing capacity, and coordination obtained through meetings. In our
opinion software engineering is not the only discipline in this quadrant. The challenges imposed by
hyper-competition create characteristics similar to those of software engineering developments. So,
the rules of engagement that proved effective for one discipline could be useful in the other.

A second line of research (Burton & Obel, 1998) introduced a classification based on a
four-variable model: equivocality, environmental complexity, uncertainty and hostility. Equivocality is
"the existence of multiple and conflicting interpretations"; it is a measure of the lack of
knowledge, or the level of ignorance, of whether a variable exists in the space. Uncertainty is the lack of
knowledge about the likelihood of values for the known variables. Environmental complexity is
the number of factors in the environment affecting the organization and their interdependency.
Finally, hostility is the level of competition and how malevolent the environment is.

In Table 1, we disregard the fourth variable: hostility. Hostility is a discontinuity of the
environment. When it is high, it overrules the other factors. In high-hostility scenarios only a highly
centralized organization ("regular army") or a low-formalization, low-complexity organization ("guerilla")
are possible alternatives.


Figure 1: Perrow's classification of technology
(Vertical axis: problem analyzability, from ill-defined to well-defined. Horizontal axis: task
variability, from few exceptions to many exceptions. Quadrants: craft, non-routine, routine,
engineering. The gray oval marks software.)


1 Van Genuchten found that 45% of all the causes for delayed software are related to organizational issues.

2 Capers Jones found that on military software developments the two most common threats are excessive
paperwork (90% of the time) and low productivity (85% of the time).

Software development scenarios usually correspond to high equivocality, high environmental
complexity and high uncertainty scenarios (dark gray in the matrix), which correspond to low
formalization and low organizational complexity, with centralization inverse to the environmental
complexity. The recommended organization could be ad hoc or matrix, with coordination by
integrator or group meeting. The information exchange is rich and abundant. The incentive policy
should be based on results.

Table 1: Burton & Obel classification


Understanding these organizational characteristics inherent in software projects is required to
create a better-fitted software process. The application of a quasi-chaotic process keeps the
organization in continuous movement, with positive effects on its internal behavior. The rhythmic change
counters managers' tendency to slow down the process or to introduce changes too often. The periodic
changes create small amounts of chaos that keep the organization on the edge.

4.  The  engineering  issues 

Despite 50 years of progress, the software industry remains too immature to meet the demands of an
information-age economy. Many researchers have treated the problem using different approaches:
formal methods, prototyping, software processes, etc. However, the problem remains open today.

The third unresolved issue is a set of engineering problems concerning software processes, risk
assessment, and reuse.

4.1.  The  software  process  problem

Studies have shown that early parts of the system development cycle, such as requirements and
design specifications, are especially prone to error (Luqi, 1989). Problems originating in the early
stages often have a lasting influence on the reliability, safety and cost of the system. This effect is
particularly pronounced in projects involving multiple stakeholders with different points of view.

Evolutionary software processes offer an iterative approach to requirement engineering to
alleviate the problems of uncertainty, ambiguity and inconsistency inherent in software developments.
Experience suggests that building and integrating software by mechanically processable formal
models leads to cheaper, faster and more reliable products. Moreover, prototyping can improve
the capture of changes in requirements and assumptions during the development process.
Prototypes are useful to demonstrate system scenarios to the affected parties as a way to: a) collect
criticisms and feedback that are sources for new requirements; b) enable early detection of
deviations from users' expectations; c) trace the evolution of the requirements; and d) improve the
communication and integration of the users and the development personnel.

Despite  the  unquestionable  benefits  of  evolutionary  software  processes,  we  have  some  concerns. 
The  first  concern  is  that  prototyping  poses  a  problem  to  project  planning  because  of  the  uncertain 
number  of  cycles  required  to  construct  the  product.  Most  project  management  and  estimation 
techniques  are  based  on  linear  layouts  of  activities,  so  they  do  not  fit  completely. 

Second, evolutionary software processes do not establish the maximum speed of the evolution. If
the evolutions occur too fast, without a period of relaxation, it is certain that the process will fall
into chaos. On the other hand, if the speed is too slow then productivity could suffer.
The correct rhythm for software processes has not been researched and remains in the hands of
the project manager.

Third, software processes should be focused on flexibility and extensibility rather than on high
quality. This assertion sounds scary. However, we should prioritize the speed of the development
over zero defects. Extending the development in order to reach high quality could result in a late
delivery of the product, when the opportunity niche has disappeared. This paradigm shift is
imposed by the competition on the edge of chaos.

4.2.  The  risk  assessment  and  estimation  problems 

Developing software is still a high-risk activity. Despite the advances in technology and tools,
little progress has been made in improving the management of software development projects. Part
of the problem is a misinterpretation of the importance of risk management, which is usually viewed as
an extra activity layered on the assigned work or, worse, as an outside activity that is not part of
the software process (Hall, 1997), (Karolak, 1996).

Software development processes such as the hypergraph model for software evolution (Luqi, 1989)
or the spiral model (Boehm, 1988) improved the state of the art. However, all of them have a
common weakness: risk assessment.

In the software evolution domain, risk assessment has not been addressed as part of the model.
In its various enhancements and extensions, the graph model did not include risk assessment
steps; hence risk management remains a human-dependent activity that requires expertise.

In the evaluation of the spiral model, one of the difficulties mentioned by Boehm was: "Relying
on risk-assessment expertise, the spiral model places a great deal of reliance on the ability of
software developers to identify and manage sources of project risk..." (Boehm, 1988).

Many researchers have addressed the problem of risk assessment following only one perspective.
The available tools for risk assessment are guidelines for practices, checklists, taxonomies of risk
factors and a few metrics. All these methods work fine if a) there is a person trained in risk
assessment, and b) he/she has enough experience. Such resources are very scarce and it is difficult
to leverage their expertise over large organizations.

The  main  line  of  previous  research  has  addressed  the  problem  in  parallel  with  the  development 
process  using  informal  methods.  Basically  the  proposed  methodologies  are  lists  of  practices  and 
checklists  (SEI,  1996),  (Hall,  1997)  or  scoring  techniques  (Karolak,  1996)  that  are  dependent  on 
human expertise.

The second weakness of risk assessment is caused by the difficulties in estimating the development
effort and time. Three classes of tools have been used to estimate effort and time; they can be
applied at different moments of the life cycle, each category being more precise than the
previous one but arriving later in the life cycle:

a) Early estimations. This category includes very crude approximations done during the
beginning of the process, usually by subjective comparisons using previous projects.

b) Macro models. This category includes Basic COCOMO, COCOMO II (application
composition model), Putnam, Function Points, etc. The estimation is done after completing the
requirements phase.

c) Micro models. This category includes intermediate and detailed COCOMO, COCOMO II
(early design and post-architecture models), and PERT/CPM/Gantt techniques. The estimation
is done after the design, when it is possible to have a work breakdown structure. The project
estimate is the integration of all module estimates.

The discussion of these techniques is outside the scope of this paper; the details can be read
in (Boehm, 1981 and 2000), (Londeix, 1987), (Putnam, 1980, 1992, ...). None of these
techniques considers the following characteristics of software projects:

a) Requirement volatility

b) Personnel volatility

c) Time consumed by communications, exceptions and noise in the process.

All the methods use size as an input parameter via some kind of derivation from complexity.
In many cases the methods used to compute such complexities and sizes are questionable
(Kitchenham, 1993 and 1997), (Kemerer, 1993).
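
To make concrete how a macro model takes size as its main input, the sketch below implements the Basic COCOMO equations as published by Boehm (1981): effort = a * KLOC^b person-months and schedule = c * effort^d months, with the standard coefficients for the three development modes. The function and dictionary names are choices of this sketch. Note that the only project-specific input is size, so none of the three characteristics listed above (requirement volatility, personnel volatility, communication overhead) enters the estimate.

```python
# Basic COCOMO coefficients (a, b, c, d) per development mode (Boehm, 1981).
COEFFS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, mode="organic"):
    """Return (effort in person-months, schedule in months) for a given size.

    kloc is the estimated size in thousands of delivered source lines.
    """
    a, b, c, d = COEFFS[mode]
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule


# Example call: a hypothetical 32 KLOC project developed in organic mode.
effort_pm, schedule_months = basic_cocomo(32.0, "organic")
```

Because the exponent b is greater than one in every mode, the estimated effort grows faster than size, which makes the quality of the size estimate the dominant source of error.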

Recently, NPS developed a formal model for risk identification and assessment for evolutionary
software processes that solves the problems of automation, human dependency, and estimation
(Nogueira et al., 2000). This research is focused on studying software project risk assessment from
a different perspective, viewing risk assessment as the prediction of success of the project given a
set of characteristics, a probabilistic model based on the Weibull distribution, and learning from each
successive cycle in the process.
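
The text does not give the form of the Weibull-based model, so the following Python sketch only shows the standard two-parameter Weibull cumulative distribution function, F(t) = 1 - exp(-(t/scale)^shape). Reading F(t) as the probability that a development cycle finishes within t time units is an assumption made here for illustration, and the shape and scale values used are hypothetical, not parameters from Nogueira et al.

```python
import math


def weibull_cdf(t, shape, scale):
    """Two-parameter Weibull CDF: probability that the variable is <= t."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def probability_within_deadline(deadline, shape, scale):
    """Assumed interpretation: chance that a cycle completes by the deadline."""
    return weibull_cdf(deadline, shape, scale)


# Hypothetical parameters: shape 1.5, scale 6 months; deadlines in months.
p_short = probability_within_deadline(3.0, shape=1.5, scale=6.0)
p_long = probability_within_deadline(9.0, shape=1.5, scale=6.0)
```

In a model that learns from each cycle, the shape and scale would be re-estimated as new cycle data arrive; that fitting step is outside this sketch.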

4.3.  The  reuse  problem 

Even if the industry calls for the use of flexible and extensible architectures from which reusable
components could be integrated as a way of generating applications, the reality is that the
standard does not exist. Different architectures are competing to become the de facto standard.
Microsoft proposes the Distributed Network Architecture (DNA) based on DCOM and ActiveX.
OMG members propose the Enterprise Computing Platform (ECP) based on IIOP
and CORBA. Each alternative presents advantages and disadvantages and it is not easy to
forecast the winner.

5.  The  edge  of  chaos


The edge of chaos is a natural state between order and chaos, a grand compromise between
structure and surprise (Kauffman, 1995). Chaos theory describes a specific range of irregular
behaviors in systems that move or change (James, 1996). Chaotic does not mean random. The
primary feature distinguishing chaotic from random behavior is the existence of one or more
attractors. Without the existence of such attractors the quasi-chaotic scenarios could not be
repeatable. It is important to realize that a chaotic system must be bounded, nonlinear, non-periodic and
sensitive to small disturbances and mixing. If a system has all these properties, it can be driven into
chaos.
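
The properties listed above can be observed in a textbook example that is not part of the original discussion: the logistic map x(n+1) = r * x(n) * (1 - x(n)) with r = 4. The Python sketch below iterates the map from two starting points that differ by only 1e-10; both orbits stay bounded in [0, 1], yet the gap between them grows until the orbits are effectively unrelated. The starting value 0.2 and the step count are arbitrary choices for illustration.

```python
def logistic_orbit(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two orbits whose initial conditions differ by a tiny disturbance.
orbit_a = logistic_orbit(0.2)
orbit_b = logistic_orbit(0.2 + 1e-10)

# Absolute gap between the two orbits at each step.
gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]
```

The rule itself is deterministic and very simple, which is the point of the distinction between chaotic and random behavior.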

We have the tendency to think that order is the ideal state of nature. This could be a big
mistake. Research on organizational theory (Stacey, Nonaka, Zimmerman), management (Stacey,
Levy), and economics (Arthur) supports the theory that operation away from equilibrium generates
creativity, self-organization processes and increasing returns (Roos, 1996). Absolute order means
the absence of variability; consequently this behavior could be very dangerous in environments
with high equivocality. In such scenarios, a better approach could be a restless series of changes
aiming at competitive advantage niches, which globally form a semi-coherent strategic direction.

Change  occurs  when  there  is  some  structure  so  that  the  change  can  be  organized,  but  not  so  rigid 
that it cannot occur. Too much chaos, on the other hand, can make coordination and coherence
impossible. Lack of structure does not always mean disorder. Let us illustrate this idea with an
example. We can agree that there is little structure in a flock of migratory ducks on a lake.
However, a few minutes after they start flying some order appears and the flock forms a V-shaped
formation. This self-organized behavior occurs because a loose form of structure exists. Experiments
with intelligent agents governed by three rules, (a) try to maintain a minimum distance from the
other objects in the environment, including other agents; (b) try to match the speed of other agents
in the vicinity; and (c) try to move toward the perceived center of mass of the agents in the
vicinity, show the same behavior. Regardless of the starting position of the agents, they always end
up in a flock. Even if an obstacle disturbs the formation, the pseudo-order is recovered some time
later. This self-organized behavior emerges despite the absence of leadership and without an
explicit order to form a flock.
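
The three rules can be sketched in a few lines of code. The following is a minimal illustration only, not the setup of the experiments cited above: the function name `step`, the tuple layout of an agent, and all weights and radii are assumptions chosen for readability, and obstacles other than agents are left out.

```python
import math


def step(agents, radius=10.0, min_dist=2.0, dt=1.0,
         w_sep=0.05, w_align=0.05, w_coh=0.01):
    """Advance all agents by one tick.

    Each agent is a tuple (x, y, vx, vy). All weights and distances are
    illustrative assumptions, not values taken from the text.
    """
    new_agents = []
    for i, (x, y, vx, vy) in enumerate(agents):
        # Agents within `radius` form the "vicinity" of agent i.
        neighbours = [a for j, a in enumerate(agents)
                      if j != i and math.hypot(a[0] - x, a[1] - y) < radius]
        ax = ay = 0.0
        if neighbours:
            n = len(neighbours)
            # Rule (a): keep a minimum distance from nearby agents.
            for nx, ny, _, _ in neighbours:
                if math.hypot(nx - x, ny - y) < min_dist:
                    ax += w_sep * (x - nx)
                    ay += w_sep * (y - ny)
            # Rule (b): match the average velocity of the vicinity.
            ax += w_align * (sum(a[2] for a in neighbours) / n - vx)
            ay += w_align * (sum(a[3] for a in neighbours) / n - vy)
            # Rule (c): move toward the centre of mass of the vicinity.
            ax += w_coh * (sum(a[0] for a in neighbours) / n - x)
            ay += w_coh * (sum(a[1] for a in neighbours) / n - y)
        vx, vy = vx + ax * dt, vy + ay * dt
        new_agents.append((x + vx * dt, y + vy * dt, vx, vy))
    return new_agents
```

Calling `step` repeatedly on a list of randomly placed agents is enough to experiment with how the three local rules interact; no agent holds a global plan or leads the others.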

A more interesting example is the behavior of software development teams. A recent article
(Cusumano, 1997) describes Microsoft's strategies for managing large teams as small teams.
Dr. Cusumano says: "What Microsoft tries to do is allow many small teams and individuals
enough freedom to work in parallel yet still function as one large team, so they can build
large-scale products relatively quickly and cheaply. The teams adhere to a few rigid rules that
enforce a high degree of coordination and communication." This is an exact description of emergent
behavior in a complex adaptive system. It is self-adaptive because the agents themselves carry out
the adjustment to the environment, and it is emergent because it arises from the system and can
only be partly predicted. As in the example of the ducks, a few rules of interaction between the
agents (in this case people) generate effective behavior. The three rigid rules at Microsoft are:
a) developers integrate their work daily, forcing the synchronization and testing of the work;
b) developers responsible for bugs must fix them immediately, and are responsible for the next
day's integration; and c) milestone stabilization is sacred.

Complex adaptive systems, such as the one just described, are made up of multiple interacting
agents. The emergence of complex behavior requires three conditions. First, more than one agent
must exist. Second, the agents must be sufficiently different from each other that their behavior
is not exactly the same in all cases. When agents behave in exactly the same way, they exhibit
predictable, not complex, behavior. Finally, complex adaptive behavior only occurs in


6.  Some  of  the  risks  of  being  in  the  edge  of  chaos 

Limiting the structure in organizations can be useful in situations where innovation is critical or
when bureaucracies need to be revitalized. However, if the structure is weakened beyond a certain
minimum, it can lead to an undesired state. Some traits can signal the imminence of such an
anarchic situation, known as the "chaos trap" (Brown & Eisenhardt, 1999): a) the emergence of a
rule-breaking culture, b) missed deadlines and unclear responsibilities and goals, and c) random
communication flows.

On the other hand, a focus on hierarchy and disciplined processes, with emphasis on schedules,
planning and job descriptions, may lead to a steady, inert bureaucracy. Organizations in such a
state react too late, failing to capture shifting strategic opportunities. This is the case of the
"bureaucratic trap", where there are also some observable warning traits: a) a rule-following
culture, b) rigid structure, tight processes and job definitions, and c) formal communication as
the only channel.

The alternative is surfing the edge of chaos, avoiding both attractors. That requires limited
structure combined with intense interaction between the agents, giving enough flexibility to develop
surprising and adaptive behavior. Organizations in this state are characterized by having an
adaptive culture. People expect and anticipate changes. A second characteristic is that the few key
existing structures are never violated. Finally, real-time communication is required throughout the
entire organization.

Being on the edge of chaos implies an unstable position. Some perturbations can rupture this
delicate equilibrium and cause a fall into one of the two steady states. A potential perturbation
factor is the organizational collaboration style. Too much collaboration can disturb the
performance of each agent and, consequently, the whole system is affected. On the other hand, too
little collaboration destroys the advantage of acting in an organized way and leads to paralysis.

Other sources of perturbation are the tendency to be tied to the past and to cultural idiosyncrasy
or, on the contrary, to lose the link with the past. In the first case, change becomes impossible.
In the other, the assets from previous experience are not capitalized on. The equilibrium point is
called regeneration. In such an unstable state, mutation can occur. Therefore the inherited
characteristics that give competitive advantage in a certain scenario can be perpetuated while new
variations are introduced. If too little variation exists, natural selection fails. This process
allows complex adaptive systems to change over time following a Darwinian pattern.

Kauffman (1995) introduced the concept of the fitness landscape. We can understand this concept
by observing the behavior of species. In the competition for survival, species attempt to alter
their genetic make-up through adaptation, trying to move to higher "fitness points" where their
viability will be enhanced. Species that are not able to reach higher points on their landscapes
may be outpaced by competitors that are more successful in doing so. If that occurs, the risk of
extinction increases. The same principle applies between predator and prey. Each development in the
abilities of one species generates an improvement in the abilities of the other. This concept is
called co-evolution.

Certain higher fitness points have more value to some species than to others. The contribution a
new gene can make to a species' fitness depends on the genes the species already has. The more
complicated (more evolved) the genetic pattern is, the higher the probability that a new adaptation
conflicts with it, which slows down the speed of variation.

Natural selection is an effective, but not generally efficient, way to evolve. The process requires
some amount of mutation to avoid sudden convergence on suboptimal characteristics. Some of the
characteristics lost in the past can be reintroduced and prove useful in the new scenario. Many
errors are committed during this blind process. A more efficient way to evolve is by recombination
of the pool of genes using genetic algorithms. This technique has been applied to improve the
performance of robots; however, the idea can also be used to improve the competencies of
organizations. If too much or too little variation occurs, the result always leads to the failure
of the system.


7.  Application  in  software  engineering 

Chaos in software development comes from various sources: a) the intrinsically variable nature of
requirements, b) the changes introduced by new technologies, c) the dynamics of the software
process, and d) the complex nature of human interaction. These non-linear characteristics plus the
condition of edge of chaos are sufficient for the development of complex adaptive systems in
which the agents are collaborative developer teams.

In software development scenarios, equivocality, environmental complexity and uncertainty are
usually high. The organizational structure suggested for such scenarios (Burton & Obel, 1998)
should have low formalization and organizational complexity, centralization inverse to the
environmental complexity, and rich and abundant information exchange. The recommended
organization should be ad hoc or matrix, with coordination by an integrator or group meetings. This
organizational style is difficult to achieve when organizations are large. A clear solution to this
problem was recognized at Microsoft (Cusumano, 1997): a) parallel development by small teams
with continuous synchronization and periodic stabilization, b) software evolution processes
where the product acquires new features in increments as the project proceeds rather than at the
end of the project, c) testing conducted in parallel as part of the evolution process, and
d) focusing creativity by evolving features and "fixing" resources. Cusumano observed that small
development teams were more productive because: a) fewer people on a team have better
communication and consistency of ideas than large teams, and b) in research, engineering and
intellectual work, individual productivity has a large variance. Software development requires
teamwork, more specifically organized work. We therefore need to understand the dynamics of
organizations as artificial social entities that exist to achieve a specific purpose, in this case
to develop software. Such organizations are made up of individuals who accomplish diverse,
disaggregated activities that require coordination and consequently information exchange.

A shift away from the traditional long-term development organizations is required. Virtual teams
created as temporary, dynamic, project-oriented structures, with a composition of skills exactly
matching the objectives, could improve current performance. Such virtual organizations are not
exposed to bureaucratic loads and do not need to absorb the cost of permanent staff
(Sengupta & Jones, 1999).

Larger developments could be achieved by loosely coupled parallel projects sharing a common
architecture such as CORBA or DCOM. This paradigm makes it possible to manage large development
organizations as if they were small. In such scenarios, the benefits of complex adaptive
systems will occur at two levels: first, at the micro level, that is, inside each small project,
where the agents are individuals; second, at the macro level, where the agents are parallel
collaborative projects.

8.  Conclusion 

Complex adaptive systems appear to be the most attractive way to deal with changing environments.
Apart from some indicators introduced by Brown & Eisenhardt (1999), academic research is not
mature enough to assert a methodology for competition on the edge. Some enterprises, such as
Microsoft and Intel, seem to have discovered and applied this form of strategy many years
ago, but little information has filtered out.

We propose a drastic change in software processes, bringing the benefits of programming in the
small to programming in the large. Moreover, we state that the quality-driven paradigm should be
revised, and that the objective should be shorter delivery times, flexibility and expansibility.

Despite the obvious differences in terms of hostility, we found several similarities between war
and software development scenarios. In-depth research is required to evaluate the applicability of
this theory to different fields in which uncertainty is a key factor: peacekeeping operations, joint
C I, and irregular warfare.

ABSTRACT

Current state-of-the-art risk assessment techniques
rely on checklists and human expertise. This constitutes a
weak approach because different people could arrive at
different conclusions from the same scenario. The
difficulty of estimating the duration of projects that apply
evolutionary software processes adds
intricacy to the risk assessment problem. This paper
introduces a formal method to assess the risk and the
duration of software projects automatically. The method
has been designed according to the characteristics of
evolutionary software processes. We introduce a set of
metrics to measure productivity, requirement volatility
and complexity. We construct a formal method based on
these three indicators to estimate the duration and risk of
evolutionary software processes. The approach introduces
benefits in two fields: a) automation of risk assessment,
and b) an early estimation method for evolutionary software
processes.

Keywords 

Risk,  software  metrics,  estimation  models 

INTRODUCTION 

Despite progress in formal methods, prototyping, and
evolutionary software processes, risk assessment remains
an open issue dependent on human expertise. Software
development processes such as the hypergraph model for
software evolution [15] or the spiral model [3] have a
common weakness: risk assessment. In the software
evolution domain, risk assessment has not been addressed
as part of the model. In its various enhancements and
extensions, the graph model did not include risk
assessment steps; hence risk management remains a
human-dependent activity that requires expertise. In the
evaluation of the spiral model, one of the difficulties
mentioned by Boehm was: "Relying on risk-assessment
expertise, the spiral model places a great deal of reliance
on the ability of software developers to identify and
manage sources of project risk." [3].

Many researchers [9, 6, 20] have addressed the problem of
risk assessment through guidelines, checklists,
taxonomies of risk factors, and a few metrics. All these
methods work well if a) they are applied by a person
trained in risk assessment, and b) he or she has enough
experience. The weakness of all current risk assessment
practices is human dependency. As a corollary, risk
assessment may not be consistent because different
experts could arrive at different conclusions from the
same scenario.

Our  research  is  focused  on  transforming  the  present  state 
of  the  art  about  risk  assessment  into  a  formal  method.  This 
paper  introduces  an  automated  and  formal  software 
project  risk  assessment  model,  based  on  early  metrics  and 
probabilities  designed  for  evolutionary  software 
processes. 

THE  PROBLEM 

Studies  have  shown  that  early  parts  of  the  system 
development  cycle  such  as  requirements  and  design 
specifications  are  especially  prone  to  error  [15].  Problems 
originating  in  the  early  stages  often  have  a  lasting 
influence  on  the  reliability,  safety  and  cost  of  the  system. 
This  effect  is  particularly  notorious  in  projects  involving 
multiple  stakeholders  with  different  points  of  view. 
Evolutionary  software  processes  offer  an  iterative 
approach  to  requirement  engineering  to  alleviate  the 
problems  of  uncertainty,  ambiguity  and  inconsistency 
inherent  in  software  developments.  Moreover,  prototyping 
can  improve  the  capture  of  change  in  requirements  and 
assumptions  during  the  development  process.  Prototypes 
are  useful  to  demonstrate  system  scenarios  to  the  affected 
parties as a way to: a) collect criticisms and feedback that
are sources for new requirements; b) enable early
detection of deviations from users’ expectations; c) trace
the evolution of the requirements; and d) improve the
communication  and  integration  of  the  users  and  the 
development  personnel. 

Despite  the  unquestionable  benefits  of  evolutionary 
software  processes,  we  have  two  concerns.  First,  the 
automated  risk  assessment  issue  has  not  been  resolved.  It 
is usually viewed as an extra activity layered on the
assigned work or, worse, as an outside activity that is not
part of the software process [6, 9]. The main line of
previous research has addressed the problem in parallel
with  the  development  process  using  informal  methods. 
Basically  the  proposed  methodologies  are  lists  of  practices 
and  checklists  [20,  6]  or  scoring  techniques  [9]  that  are 
dependent  on  human  expertise. 

The  second  concern  is  that  prototyping  poses  a  problem  to 
project  planning  because  of  the  uncertain  number  of 
cycles  required  to  construct  the  product.  The  industry  has 
been  using  three  classes  of  tools  to  estimate  effort  and 
time  that  can  be  applied  at  different  moments  during  the 
life  cycle,  each  category  being  more  precise  than  the 
previous  one  but  arriving  later: 

a) Very early estimations. This category includes very
crude approximations done at the beginning of
the process, usually by subjective comparisons with
previous projects.

b) Macro models. This category includes Basic
COCOMO, COCOMO II (application composition
model), Putnam, Function Points, etc. The estimation
is done after completing the requirements phase.

c) Micro models. This category includes intermediate
and detailed COCOMO, COCOMO II (early design
and post-architecture models), and PERT/CPM/Gantt
techniques. The estimation is done after the design,
when it is possible to have a work breakdown
structure. The project estimate is the integration of all
module estimates based on linear layouts of activities,
so they do not fit completely with evolutionary
software processes.

A detailed discussion of these techniques is outside the
scope of this paper; the details can be found in [1, 2, 4, 6,
14, 16, 17, 18, 19]. None of these techniques considers the
following characteristics of software projects: a)
requirement volatility, b) personnel volatility, and c) time
consumed by communications, exceptions and noise in the
process. All the methods use size as an input parameter
via some kind of derivation from complexity. In many
cases the methods to compute such complexities and sizes
are questionable [10, 11, 12].

METRICS 

In  this  section  we  describe  a  small  set  of  metrics  that 
support  our  risk  identification  strategy  (requirements, 
personnel and complexity). We chose metrics with
the following characteristics: a) robustness, b)
repeatability, c) simplicity in terms of the number of
parameters, d) ease of calculation, and e) automatic
collectability.

Metrics  for  requirements 

We propose three metrics for requirements: a) birth-rate,
b) death-rate, and c) change-rate. We define birth-rate
(BR) as the percentage of new requirements incorporated
in each cycle of the evolution process. This metric shows
the introduction of new requirements as a percentage.

We  define  death-rate  (DR)  as  the  percentage  of 
requirements  that  are  dropped  by  the  customer  in  each 
cycle  of  the  evolution  process. 

We  define  change-rate  (CR)  as  the  percentage  of 
requirements  changed  from  the  previous  cycle. 


From the point of view of the metrics, a change in a
requirement can be viewed as a death of the old version
and a birth of the new one. The simplification just
described enables comparison of birth-rate and death-rate
in a two-dimensional plot that shows four regions: stability
region, growing region, volatility region and shrinking
region (fig. 1). Each of these regions has different risk
connotations. The arrow shows the normal evolution of a
project as time goes by. During early stages, it is normal
for projects to be in the growing region. However, if the
project remains in this region after many cycles, or returns
to this region after visiting other regions, something
is wrong. The first case is an indicator that the
requirements engineering is not efficient; hence some
corrective action should be applied. The second case
shows evidence of late discovery of some cluster of
hidden requirements.

After  some  cycles,  the  project  should  be  in  the  volatile 
region.  If  the  project  does  not  evolve  into  the  stability 
region,  then  there  is  evidence  that  the  requirements 
engineering  activity  is  not  efficient  and  some  corrective 
action  may  be  needed.  It  is  important  to  analyze  the 
evolution  of  the  stakeholders’  issues  and  criticisms.  It 
could  be  also  the  case  that  stakeholders  have  changed  their 
minds.  If  the  project  evolves  to  the  shrinking  region,  and 
the  requirements  engineering  is  working  right,  there  is 
evidence  that  the  customers  are  cutting  down  the  project. 
This  can  be  an  indicator  of  a  severe  cut  in  the  budget. 
Finally,  any  return  to  a  previous  region  should  be 
considered  as  evidence  of  threats.  In  such  cases  a  detailed 
analysis  is  required  to  assess  the  causes  of  the  anomaly. 
This  set  of  metrics  can  be  collected  automatically  from  the 
baseline  and  can  give  early  alerts  of  threats.  In  our 
schema,  requirement  volatility  is  related  to  two  risk 
factors:  the  product  and  the  process. 
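
As a rough sketch of how these metrics could be collected from two successive baselines, the code below computes BR and DR and maps them to one of the four regions. Several points are assumptions not stated in the text: rates are taken relative to the size of the previous baseline, the mapping of regions to high or low BR and DR is inferred from the region names, and the 10% threshold and the names `rates` and `region` are purely illustrative.

```python
def rates(previous, current):
    """Return (BR, DR) in percent for one evolution cycle.

    `previous` and `current` map requirement ids to requirement text.
    A changed requirement counts as one death plus one birth, following
    the simplification described in the text. Using the previous
    baseline size as the denominator is an assumption.
    """
    if not previous:
        raise ValueError("previous baseline must not be empty")
    born = [r for r in current if r not in previous]
    dropped = [r for r in previous if r not in current]
    changed = [r for r in current
               if r in previous and current[r] != previous[r]]
    base = len(previous)
    br = 100.0 * (len(born) + len(changed)) / base
    dr = 100.0 * (len(dropped) + len(changed)) / base
    return br, dr


def region(br, dr, threshold=10.0):
    """Classify a cycle; the threshold is a hypothetical cut-off."""
    high_br, high_dr = br >= threshold, dr >= threshold
    if high_br and high_dr:
        return "volatility"
    if high_br:
        return "growing"
    if high_dr:
        return "shrinking"
    return "stability"
```

Tracking the sequence of regions over successive cycles then makes it possible to flag a project that stays in the growing region or returns to a region it has already left.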
Figure 1: Evolution of requirements

Metrics for fitness

We need to measure the fit between people and their roles
in the software process. To measure personnel,
both quantitative and qualitative metrics are required. A
skill match between person and job is required to estimate
the speed of information processing and the rate of
exceptions. On the quantitative side, it is important to
measure the number of people and the turnover. The
latter provides information about the expected productivity
losses due to training, learning curves and
communications. This set of metrics is difficult to collect
because people are very reluctant to be measured.
During the simulations we found that there is an easier
way to measure productivity fitness: observing the ratio
between direct working time and idle time, as we
discuss in Section 6.1. Fitness is related to two risk factors: the
resources and the process.

Metrics  for  complexity 

Complexity has a direct impact on quality because the
likelihood that a component fails is directly related to its
complexity. The quality of the product can only be
determined at the end of the process. Hence, it is
important to measure complexity as an early predictor,
to provide a way to assess the duration of the project given
some indicators collected during the requirements phase.
Under such conditions, code is not available, so the only
possible measurements must come from the
specification. Complexity is related to one risk factor: the
product.

Research on Function Points (FP) [1, 2] showed that there
exists a clear relation between complexity and size in
terms of lines of code. However, FP are not well suited for
real-time systems or object-oriented development [10, 11,
12].

Formal specifications are suitable for analysis to
compute their complexity. We conducted experiments
trying to derive complexity from formal specifications
created by CAPS (Computer Aided Prototyping System)
[15]. The tool generates specifications in a structured
language called the Prototype System Description
Language (PSDL). PSDL code has the following
components: types, operators, data streams and
constraints. Types are declarations of abstract data types
required for the system. Operators are state machines and
data streams represent the communication links between
them. Both operators and data streams are the components
of a dataflow graph. Finally, constraints represent the
real-time constraints that the system must support. The tool
generates Ada code from PSDL specifications.

We defined two complexity metrics for PSDL: a) the Fine
Granularity Complexity metric (FGC), and b) the Large
Granularity Complexity metric (LGC). The reason to
compute  different  metrics  is  because  we  want  to  detect 
two  classes  of  threats.  First,  we  need  to  be  aware  of 
excessively  complex  operators.  High  complexity  of  one 
operator  could  be  caused  by  poor  design  and  possibly  can 
be  solved  by  further  decomposition.  Second,  we  require  a 
metric  to  compute  the  total  complexity  of  the  system. 

FGC expresses the complexity of each operator in the
system and is the sum of the fan-in and fan-out data
streams related to the operator (FGC = fan-in + fan-out).
LGC expresses the complexity of the system as a function
of the number of operators (O), data streams (D), and
types (T) (LGC = O + D + T).
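
Both metrics are simple counts over the dataflow graph, as the sketch below illustrates. It assumes the LGC formula reads as the sum O + D + T, and it represents a specification as plain Python lists rather than parsed PSDL; the function names and the sample graph are hypothetical.

```python
def fgc(operator, streams):
    """Fine Granularity Complexity of one operator: fan-in + fan-out.

    `streams` is a list of (source, destination) operator-name pairs.
    """
    fan_in = sum(1 for _, dst in streams if dst == operator)
    fan_out = sum(1 for src, _ in streams if src == operator)
    return fan_in + fan_out


def lgc(operators, streams, types):
    """Large Granularity Complexity, read here as O + D + T."""
    return len(operators) + len(streams) + len(types)


# Hypothetical three-operator dataflow graph used only as an example.
operators = ["sensor", "filter", "display"]
streams = [("sensor", "filter"), ("filter", "display"),
           ("sensor", "display")]
types = ["reading"]

per_operator = {op: fgc(op, streams) for op in operators}
total = lgc(operators, streams, types)
```

An operator whose FGC stands out from the rest is a candidate for further decomposition, while the LGC gives a single figure for the whole system.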

We found a strong correlation between PSDL lines of
code and LGC (R = 0.996, fig. 2). If we compare the Ada
non-comment lines of code of the projects with their
complexity measured using LGC, we also observe a strong
correlation (R = 0.898, fig. 3). Our complexity metric
correlates better with PSDL than with Ada because CAPS
automatically generates PSDL; on the other hand, even if
CAPS generates part of the Ada code, the designer can
add to and modify the generated code, introducing more
variability. The size of the project in thousands of
non-comment lines of code can be estimated as:

KLOC = (32 LGC - 150) / 1000    [Eq. 1]

As the complexity grows, the ratio tends to
approximately 32 LOC for each unit of LGC. This finding
provided us with a method to compute the size of
projects given an early measure of their complexity. This
conversion is required to compare our approach with
Putnam's and Boehm's approaches because they require
the size as an input parameter. A caveat of this study is
that our sample is small, but it includes all the information
we have at the current time. However, the study suggests
the possibility of estimating size in terms of complexity
with a useful degree of accuracy.

Figure 2: Correlation between PSDL and LGC
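
A small worked example shows why the ratio approaches 32 LOC per unit of LGC. The sketch assumes Eq. 1 reads KLOC = (32 LGC - 150) / 1000; the function names are illustrative only.

```python
def estimate_kloc(lgc):
    """Estimated size in thousands of non-comment lines (Eq. 1 as read here)."""
    return (32 * lgc - 150) / 1000


def loc_per_unit(lgc):
    """Estimated lines of code per unit of LGC: 32 - 150 / LGC."""
    return (32 * lgc - 150) / lgc


for value in (50, 100, 1000):
    print(value, estimate_kloc(value), round(loc_per_unit(value), 2))
```

Because the constant term is fixed, its share of the estimate shrinks as LGC grows, so the per-unit ratio climbs toward 32. For very small LGC values the linear fit yields a negative size and should not be used.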


THE RISK ASSESSMENT MODEL

A probability distribution from the Weibull family can be
used to model the development time given the risk factors
discussed above. The probability density function and
cumulative distribution function for the model are:

where:

a) x is the random variable under study. In our case, x
can be interpreted as development time.

b) α is a shape parameter. It determines the width of the
peak of the distribution and the expected error. We
can associate this behavior with the efficiency of the
project, which depends on characteristics of the
process and the resources.

c) β is a scale parameter that stretches or compresses the
graph in the x direction and hence controls the
thickness of the tail. This parameter models the extra
work introduced by new requirements or changes in
requirements.

d) Note that the functions start at x = 0. We require a
third parameter to shift the curves to the right. For
that reason we introduce a location parameter γ,
which is a function of the already discovered system
complexity.
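
The roles of the three parameters can be explored numerically. The sketch below assumes the model uses the standard three-parameter Weibull form, f(x) = (α/β) ((x-γ)/β)^(α-1) exp(-((x-γ)/β)^α) and F(x) = 1 - exp(-((x-γ)/β)^α) for x > γ; this parameterization and the function names are assumptions, not taken from the text.

```python
import math


def weibull_pdf(x, alpha, beta, gamma=0.0):
    """Density of the standard three-parameter Weibull (assumed form).

    alpha: shape, beta: scale, gamma: location (shift to the right).
    Returns 0 at or below gamma; this simplification sidesteps the
    boundary case at x == gamma for alpha <= 1.
    """
    if x <= gamma:
        return 0.0
    z = (x - gamma) / beta
    return (alpha / beta) * z ** (alpha - 1) * math.exp(-(z ** alpha))


def weibull_cdf(x, alpha, beta, gamma=0.0):
    """Probability that development time is at most x."""
    if x <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((x - gamma) / beta) ** alpha))
```

Under this form, `weibull_cdf(t, alpha, beta, gamma)` gives the probability of finishing within time t, so the complement `1 - weibull_cdf(...)` can be read as the risk of overrunning a given schedule.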

CALIBRATION  OF  PARAMETERS 

To calibrate productivity (α) and requirements volatility
(β), we conducted simulations with ViteProject [8, 13]
using the following scenarios (fig. 4). Each scenario name
consists of three letters describing the value of each of
the three variables under study: productivity (α),
requirements volatility (β), and complexity (γ). Each
letter can take two values: high (H) or low (L). The tool
was configured to run 100 simulations for each scenario,
and the organizational parameters were set to match the
characteristics of software development.

Figure 4: Scenario characteristics

To analyze the effect of productivity, we compared the
results of the simulations of the following scenarios: LLL
vs HLL, LLH vs HLH, LHL vs HHL, and LHH vs HHH.
We found that for high productivity scenarios (Hxx) the
development time improved by 60%.

To analyze the effect of requirement volatility, we
compared the results of the simulations of the following
scenarios: LLL vs LHL, LLH vs LHH, HLL vs HHL, and
HLH vs HHH. We found that high requirement volatility
(xHx) degraded the development time by 20%.

To analyze the effect of complexity, we compared the
results of the simulations of the following scenarios: LLL
vs LLH, LHL vs LHH, HLL vs HLH, and HHL vs HHH.
We found that high complexity (xxH) degraded the
development time by 30%.

6.1 Productivity (α)

The literature on productivity classifies time spent at work into
four categories:

a) Direct. Time spent working and correcting errors on
the product. In ViteProject terminology, it is the sum
of work and rework.

b) Indirect. Time spent in activities supporting the work
such as meetings, coordination, information
exchanges, etc. In ViteProject terminology, it is
known as coordination time.

c) Idle. Time spent without work to do, waiting for some
input. In ViteProject terminology, it is known as
waiting time.

d) Personal. Time spent doing anything except the other
categories. ViteProject does not compute this
category of time. However, it is loosely related to the
noise parameter.

If we examine the time distribution of these categories we
can observe a remarkable pattern that differentiates high
productivity scenarios from the low productivity ones.
This effect is independent of the other two variables of the
simulation. Hence, this suggests that the time distribution
can be a good indicator for the parameter α.

Figure 5 presents the time distributions for the eight
scenarios simulated. A pattern of time distributions can be
clearly observed. Scenarios with low productivity have a
percentage of idle time greater than 13% of the total
development time.

We can also recognize low productivity scenarios by the
ratio of the percentage of direct time to the percentage of
idle time, which we call the productive ratio (PR):


PR = α = Direct% / Idle%   [Eq. 5]

For high productivity scenarios 2.0 < PR < 6.0, and for
low productivity scenarios 0.8 < PR < 2.0.
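
As a rough illustration of Eq. 5, the following sketch computes the productive ratio from the direct and idle time percentages and applies the PR ranges reported above. The function names and the sample percentages are hypothetical; only the ratio itself and the range boundaries come from the text.

```python
def productive_ratio(direct_pct: float, idle_pct: float) -> float:
    """PR = Direct% / Idle% (Eq. 5)."""
    if idle_pct <= 0:
        raise ValueError("idle percentage must be positive")
    return direct_pct / idle_pct


def productivity_level(pr: float) -> str:
    """Classify a scenario using the PR ranges reported in the text."""
    if 2.0 < pr < 6.0:
        return "high"
    if 0.8 < pr < 2.0:
        return "low"
    return "outside observed range"


# Hypothetical time distributions (percent of total development time).
pr_a = productive_ratio(direct_pct=60.0, idle_pct=12.0)
pr_b = productive_ratio(direct_pct=24.0, idle_pct=16.0)
print(pr_a, productivity_level(pr_a))
print(pr_b, productivity_level(pr_b))
```

Values that fall exactly on a boundary, or outside 0.8 to 6.0, are reported separately because the text gives no guidance for them.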


Figure  5:  Time  distribution  from  each  scenario 


We observed that, using PR as the value of α, the model
behaves like the simulations: in high productivity
scenarios the total development time is 60% shorter than in
low productivity ones. The reasons why the ratio PR is related
to productivity require further study. However, we
conjecture that the reason could be related to:

a)  Fit  of  job  and  people  skills. 

b) People turnover, generating noise and productivity
losses derived from training and learning curves.

c) Number of people, influencing productivity through an
excess or shortage of workforce.

In the model, the use of α ranging from 0.8 (low
productivity) to 6 (highest productivity) corresponds to
the results observed in the simulations.
6.2 Requirements volatility (β)

β, the extra delay factor caused by requirements volatility
(late requirements and changes in previous requirements),
is obtained by the following formula:

β = INT((BR + DR) / 10)   [Eq. 6]

Our simulations showed a 20% increase in the
development time when the requirements volatility is
high.
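
A minimal sketch of Eq. 6, assuming that the formula sums the birth rate and death rate (both expressed as percentages) and that INT truncates toward zero. The function name and the sample rates are hypothetical.

```python
def volatility_delay_factor(birth_rate: float, death_rate: float) -> int:
    """Delay factor from requirements volatility: INT((BR + DR) / 10).

    BR and DR are percentages; INT is assumed to truncate toward zero.
    """
    if birth_rate < 0 or death_rate < 0:
        raise ValueError("rates must be non-negative percentages")
    return int((birth_rate + death_rate) / 10)


# Hypothetical cycle: 25% new requirements, 10% dropped requirements.
print(volatility_delay_factor(25.0, 10.0))
```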

6.3 Complexity (γ)

Having found a complexity metric suited for our purpose,
the next step was to look for a relationship between
LGC and development time.

We conducted a simple experiment using the conversion
ratio [Eq. 1] to obtain the size inputs for the sample. We
used sample points from 1000 LGC to 30000 LGC, which
corresponds to sample projects from 32 KLOC to almost 1 MLOC.
We computed the average estimate of the development
time using COCOMO and Putnam. The sample points are
plotted as a smoothed thick line. The logarithmic
trendline is plotted as a thin red line. We found a strong
logarithmic correlation (R² ≈ 0.9699) with the following
function (Fig. 6).

Time (months) = γ = 13 Ln(LGC) - 82   [Eq. 7]
Figure 6: Complexity-time correlation

This  equation  gives  a  conservative  estimation  for  projects 
between  4000  and  20000  LGC  (128  and  640  KLOC  of 
Ada).  The  estimation  seems  to  be  too  optimistic  for 
projects  smaller  than  2000  LGC  or  greater  than  25000 
LGC.  Figure  9  shows  the  effects  of  complexity  over 
different  scenarios.  The  development  time  increases  by 
20%  when  the  complexity  is  high. 
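
The logarithmic relationship of Eq. 7 can be sketched as follows, assuming the equation is read as 13 times the natural logarithm of LGC minus 82, in months. The function name is hypothetical; the sample sizes are the bounds of the range for which the text calls the estimate conservative.

```python
import math


def complexity_time_months(lgc: float) -> float:
    """Time (months) = 13 * ln(LGC) - 82, as read from Eq. 7."""
    if lgc <= 0:
        raise ValueError("LGC must be positive")
    return 13.0 * math.log(lgc) - 82.0


# Bounds of the range where the text calls the estimate conservative.
for lgc in (4000, 20000):
    print(lgc, round(complexity_time_months(lgc), 1))
```

Because the function is logarithmic, it becomes negative for small LGC values, which is consistent with the warning that the estimate is unreliable for small projects.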

6.4  The  complete  model 

Our model requires three parameters (α, β, γ) that can be
derived from metrics automatically collected from the
development environment (Eq. 5, 6 and 7). If the
development environment does not have the functionality
to collect metrics, then a manual procedure could provide
the data. Using these values in Eq. 3 we obtain the
probability of finishing the project at any given time (x in
months) (Fig. 7). The model makes it possible to refine the
estimate from the knowledge captured at each
evolutionary cycle. As the development progresses, γ
increases (known complexity) and β decreases (less tail).

CONCLUSION 

We  introduced  a  formal  method  for  risk  assessment  that 
solves  the  issue  of  human  dependency,  characteristic  of 
the  current  risk  assessment  methodologies.  This  method  is 
supported  by  a  small  set  of  metrics  that  can  be 
automatically  collected  from  the  development 
environment. 

One of the metrics introduced, the productivity ratio,
constitutes an objective method to assess the productivity
level of an organization without the subjective judgement of
experts.


Figure 7: Distribution functions (probability of finishing by time x)

We introduced a complexity metric well suited for real-time
systems that has a strong correlation with development
time. Although this metric was developed specifically for
PSDL, the method can be generalized to other
methodologies using Object Points or number of classes
instead of LGC.

An interesting side effect of the model is that it provides an
easy way to estimate, very early in the life cycle, the
duration of a project and, indirectly, its cost. This method
enables  an  earlier  assessment  of  the  duration  of  the  project 
and  solves  the  problems  of: 

a)  Human  dependency  on  risk  assessment,  and 

b)  Difficulties  in  estimating  time  on  evolutionary 
prototyping  software  processes. 

Further  research  is  required  to  generalize  the  method  for 
larger  systems  and  for  different  domains. 
Abstract

Software prototyping processes have contributed to the
development of cheaper, faster and more reliable products. However,
despite the advances in technology, little progress has been
made in improving the management of software prototyping
development projects. Research shows that 45 percent of
all the causes for delayed software deliveries are related to
organizational issues [1]. This paper addresses the risk
assessment issue, introducing metrics and a model that can
be integrated with prototyping development processes.


1.  Introduction 

Despite 50 years of progress, the software industry remains
too immature to meet the demands of an information-age
economy. Many researchers have treated the problem
using different approaches: formal methods, prototyping,
software processes, etc. However, this assertion remains
true today. Experience suggests that building and integrating
software by mechanically processable formal models
leads to cheaper, faster and more reliable products [2].
Software development processes such as the hypergraph
model for software evolution [2] or the spiral model [3]
have improved the state of the art. However, they have a
common weakness: risk assessment. In the software evolution
domain, risk assessment has not been addressed as part
of the model. In its various enhancements and extensions,
the graph model did not include risk assessment steps;
hence risk management remains a human-dependent
activity that requires expertise. In his evaluation of the
spiral model, one of the difficulties mentioned by Boehm
was: "Relying on risk-assessment expertise. The spiral
model places a great deal of reliance on the ability of software
developers to identify and manage sources of project
risk." [3].

Many researchers have addressed the problem of risk assessment
following the perspective of the traditional disciplines.
The available tools for risk assessment are guidelines
for practices, checklists, taxonomies of risk factors
and a few metrics. All these methods work fine if a) there is a
person educated in risk assessment, and b) he/she has
enough experience. Such resources are very scarce. Our
research is focused on software project risk assessment,
which in other words is the prediction of the success of the project.
The only way to evaluate the degree of success of a
project is: a) to compare the planned and actual schedules;
b) to compare the planned and actual costs; and c) to compare
the planned and actual product characteristics. An
emergent branch of software engineering has covered this
last part: software reliability. However, we think that more
emphasis should be put on the first two. We believe that evolutionary
prototyping provides the most promising context to
address these issues.

1.1. Impact of evolutionary software processes

Studies have shown that early parts of the system development
cycle, such as requirements and design specifications,
are especially prone to errors [2]. Problems originating
in the early stages often have a lasting influence on the
reliability, safety and cost of the system. This effect is
particularly pronounced in projects involving multiple stakeholders
with different points of view. Evolutionary prototyping
offers an iterative approach to requirements engineering
to alleviate the problems of uncertainty, ambiguity and
inconsistency inherent in the process. Moreover, prototyping
can improve the capture of changes in requirements and
assumptions during the development process.

Evolution-driven CASE tools for computer-aided prototyping
provide logical assessment of the consistency and
clarity of requirements and specifications. The use of prototypes
facilitates the requirements phase in any type of software
project. In particular, in real-time applications where
severe time constraints impose more challenges, the use of
prototypes helps to describe the requirements in a
clear, precise, consistent and executable format. Prototypes
are useful to demonstrate system scenarios to the affected
parties as a way to: a) collect criticisms and feedback that
are sources for new requirements; b) detect deviations
from users' expectations early; c) trace the evolution of
the requirements; and d) improve the communication and
integration of the users and the development personnel.

Despite the unquestionable benefits of prototyping, we
have two concerns. First, the risk assessment issue has not
been solved. Second, prototyping poses a
problem for project planning because of the uncertain number
of cycles required to construct the product. Most
project management and estimation techniques are based
on linear layouts of activities, so they do not fit completely.

1.2.  The  estimation  problem 

In order to assess the risk in a project, it is necessary to
have an idea of the effort and time involved. The industry
has been using three classes of tools to estimate effort and
time that can be applied at different moments during the
life cycle, each category being more precise than the previous
one but arriving later:

a) Very early estimations. This category includes very
crude approximations done at the beginning of the
process, usually by subjective comparisons with previous
projects.

b) Macro models. This category includes Basic
COCOMO, Putnam, Function Points, etc. The estimation
is done after completing the requirements phase.

c) Micro models. This category includes intermediate and
detailed COCOMO, and PERT/CPM/Gantt techniques.
The estimation is done after the design, when it is possible
to have a work breakdown structure. The project
estimate is the integration of all module estimates.

It is not our intention to discuss these techniques; the details
can be read in [4], [5], [6] and [7]. However, we highlight
the assumptions of COCOMO and Putnam's methods.
COCOMO assumes:

(1) The development period starts at the beginning of the
design phase. That means that the requirements phase
is already done.

(2) The estimation covers only the direct-charged labor. In
other words, time spent in meetings and communication
is not considered.

(3) The model assumes a rather optimistic working time
of 152 hours of productive work per month.

(4) The model assumes that the project will enjoy "good
management."

(5) Finally, the model assumes that the requirements will
remain unchanged. This is a really restrictive assumption
that does not match the evolutionary prototyping
process.

The  other  de  facto  standard,  Putnam’s  model,  is  based  on 
the  following  assumptions: 

(1) A development project is a finite sequence of purposeful,
temporally ordered activities, operating on a homogeneous
set of problem elements, to meet a specified
set of objectives.

(2) The number of problem elements is unknown but finite.

(3) Problems are detected, recognized and solved by applying
effort.

(4) The occurrence of problem solving follows a Poisson
process.

(5) The number of people working in the project is proportional
to the number of problems ready to solve at that
time.

(6) The requirements are done, which is very restrictive
considering evolutionary software processes.

None of these techniques considers the following characteristics
of software projects: a) requirement volatility, b)
personnel volatility, and c) time consumed by communications,
exceptions and noise in the process. All the methods
use size as an input parameter via some kind of derivation
from complexity. In many cases the methods to compute
such complexities and sizes are questionable. Recently,
Stanford University [7] developed a new-generation micro-model
estimation tool (ViteProject) that addresses some of
our concerns. This tool requires a complete
work breakdown of the project, so it is useful to control
the project but cannot be used for early estimations. However,
it is very useful to simulate different scenarios. We are
using this approach to calibrate our model.

2.  Metrics 

Metrics are a key factor in the identification of threats.
Without metrics it is not possible to provide early alerts of
risks. In this section we describe a set of metrics that support
our risk identification strategy. We decided to use a
small set of metrics with the following characteristics:
a) robustness, b) repeatability, c) simplicity in terms of
the number of parameters, d) ease of calculation, and e)
automatic collectability.

2.1.  Metrics  for  Requirements 

We define birth rate (BR) as the percentage of new requirements
incorporated in each cycle of the evolution
process. This metric shows the explosion of new requirements
as a percentage.

BR = (NR / TR) * 100, where   (Eq. 1)

NR = number of new requirements,

TR = total number of requirements (including NR).

We define death rate (DR) as the percentage of requirements
that are dropped by the customer in each cycle
of the evolution process.

DR = (DelR / TR) * 100, where   (Eq. 2)

DelR = number of requirements deleted,

TR = total number of requirements (before deletion).

We define change rate (CR) as the percentage of requirements
changed from the previous version.

CR = (ModR / TR) * 100, where   (Eq. 3)

ModR = number of requirements changed,

TR = total number of requirements.


Figure 1: Evolution of requirements (birth rate versus death rate)

From the point of view of the metrics, a change to a requirement
can be viewed as the death of the old version and the
birth of the new one. This simplification makes it possible
to compare birth rate and death rate in a two-dimensional
plot that shows four regions: stability region,
growing region, volatility region and shrinking region.
Each of these regions has different risk connotations. There
is a normal evolution of the project as time goes by.
During early stages, it is normal for projects to be in the
growing region. However, if the project stays in this
region after many cycles, or returns to this region after visiting
other regions, then something could be wrong. In
the first case, the requirements engineering may not be
efficient. The second case could show evidence of late discovery
of some cluster of hidden requirements. After some
cycles, the project should leave the volatile region. If the
project evolves to the shrinking region, and the requirements
engineering is working right, there is evidence that
the customers are cutting down the project. This can be an
indicator of a severe cut in the budget. Finally, any regression
to a previous region should be considered as evidence
of threats. In such cases a detailed analysis is required to
assess the causes of the anomaly.
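
The requirement metrics and the four regions can be sketched as follows. The text does not give numeric boundaries for the regions, so the 10% threshold, the quadrant layout (high birth and low death as "growing", and so on), the function names and the sample cycles are all assumptions made for illustration. For simplicity the sketch also uses a single total per cycle for both rates.

```python
def birth_rate(new_reqs: int, total_reqs: int) -> float:
    """BR = (NR / TR) * 100, where TR includes the new requirements."""
    return 100.0 * new_reqs / total_reqs


def death_rate(deleted_reqs: int, total_reqs: int) -> float:
    """DR = (DelR / TR) * 100, where TR is counted before deletion."""
    return 100.0 * deleted_reqs / total_reqs


def region(br: float, dr: float, threshold: float = 10.0) -> str:
    """Map a (BR, DR) point to one of the four regions.

    The threshold and the quadrant layout are assumptions of this sketch.
    """
    high_birth = br >= threshold
    high_death = dr >= threshold
    if high_birth and high_death:
        return "volatility"
    if high_birth:
        return "growing"
    if high_death:
        return "shrinking"
    return "stability"


# Hypothetical cycles: (new, deleted, total requirements).
cycles = [(30, 2, 100), (20, 15, 110), (3, 2, 98)]
history = [region(birth_rate(n, t), death_rate(d, t)) for n, d, t in cycles]
print(history)
```

A changed requirement would be counted once as a deletion and once as a new requirement before calling these functions, following the simplification described above.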


2.2.  Metrics  for  Personnel 

In order to measure personnel, both quantitative and
qualitative metrics are required. The skill match between
person and job is required to estimate the speed in processing
information and the rate of exceptions. On the quantitative
side we propose to measure the number of people and the
turnover. The latter provides information about the expected
productivity losses due to training, learning curves
and communications.


2.3. Metrics  for  Complexity 

Complexity has a direct impact on quality because the
likelihood that a component fails is directly related to its
complexity. The quality of the product can only be determined
at the end of the process. Hence, it is important to
measure the complexity as a predictor. This is particularly useful
in real-time systems, which present special difficulties
in terms of requirements engineering. Some requirements
are difficult for the user to provide and difficult for the
analysts to determine. The best way to discover these hidden
requirements is via prototyping. The Computer Aided Prototyping
System (CAPS) [2] is a CASE tool specially suited for
this task. It has an easy-to-understand graphical interface that is
mapped to a specification language, which in turn generates
Ada code.

The prototyping process consists of prototype construction
and modification (evolution) based on evolving requirements
and code generation. Both construction and
modification are exploratory activities with a common target:
to satisfy multiple users with different and often conflicting
points of view. Requirements engineering is a
consensus-driven activity in which mechanisms for conflict
resolution and traceability of requirement evolution represent
critical success factors.
Formal specifications are suitable for being analyzed to
compute their complexity. In the case of CAPS, the tool
generates specifications in a structured language called
Prototyping Specification Design Language (PSDL). PSDL
code has the following tokens: types, operators, data
streams and constraints. Types are declarations of abstract
data types required for the system. Operators and data
streams are the components of a dataflow graph. Finally,
constraints represent the real-time constraints that the system
must support.


We define two complexity metrics for PSDL: the Fine
Granularity Complexity metric (FGC) and the Large Granularity
Complexity metric (LGC). We compute two
different metrics because we want to detect two classes of
threats. First, we need to be aware of operators that are too
complex. High complexity in one operator could be caused
by poor design and can possibly be solved by further decomposition.
Second, we require a metric to compute the
total complexity of the system.

FGC expresses the complexity of each operator in the
system and is a function of the fan-in and fan-out data
streams related to the operator.

FGC = fan-in + fan-out   (Eq. 4)

LGC expresses the complexity of the system as a function
of the number of operators (O), data streams (D) and
types (T).

LGC = O + D + T   (Eq. 5)
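
The two metrics can be illustrated on a small dataflow graph. The graph below, its operator, stream and type names, and the tuple representation of a stream are hypothetical and are not PSDL syntax; only the two formulas come from the text.

```python
from collections import Counter

# Hypothetical dataflow graph: each stream is (name, producer, consumer).
streams = [
    ("raw", "sensor", "filter"),
    ("clean", "filter", "controller"),
    ("setpoint", "operator_panel", "controller"),
    ("command", "controller", "actuator"),
]
operators = {"sensor", "filter", "operator_panel", "controller", "actuator"}
types = {"Reading", "Command"}


def fgc(streams, operators):
    """FGC per operator: fan-in plus fan-out data streams (Eq. 4)."""
    fan_in = Counter(consumer for _, _, consumer in streams)
    fan_out = Counter(producer for _, producer, _ in streams)
    return {op: fan_in[op] + fan_out[op] for op in operators}


def lgc(operators, streams, types):
    """LGC of the system: operators + data streams + types (Eq. 5)."""
    return len(operators) + len(streams) + len(types)


per_operator = fgc(streams, operators)
most_complex = max(per_operator, key=per_operator.get)
print(per_operator)
print("candidate for decomposition:", most_complex)
print("LGC =", lgc(operators, streams, types))
```

Ranking operators by FGC supports the first use described above (spotting operators that may need further decomposition), while LGC gives the system-wide total.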

We examined the correlation between LGC and the size of the
specifications and the code. We observed a very strong correlation
between PSDL lines of code and LGC (R = 0.996).
We also observed a strong correlation between the Ada
non-comment lines of code of the projects and their
complexity measured using LGC (R = 0.898) (Fig. 2). Even
if CAPS generates part of the Ada code, the designer can
add to and modify the generated code, introducing more variability.
The following graph shows the correlation observed
for the same set of projects. The size of the project in thousands
of non-comment lines of code can be estimated as:


KLOC = (32 LGC + 150) / 1000   (Eq. 6)
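
A short sketch of the size conversion in Eq. 6. The function name is hypothetical; the sample values are the LGC bounds quoted elsewhere in the paper, which correspond to roughly 128 and 640 KLOC of Ada.

```python
def estimated_kloc(lgc: float) -> float:
    """KLOC = (32 * LGC + 150) / 1000 (Eq. 6), in thousands of
    non-comment Ada lines of code."""
    if lgc < 0:
        raise ValueError("LGC must be non-negative")
    return (32.0 * lgc + 150.0) / 1000.0


for lgc_value in (4000, 20000):
    print(lgc_value, "LGC ->", round(estimated_kloc(lgc_value)), "KLOC")
```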

3.  The  proposed  model 

From the point of view of software engineering, it is
necessary to create a methodology to support the decision-making
process during the early stages of the life cycle,
when changes can be made with less impact on the budget
and schedule. The most significant causes of software project
failures are: lack of understanding of users' needs,
ill-defined scopes, poor management of project changes,
changes in the chosen technology, changes in business
needs, unrealistic deadlines, users' resistance, loss of sponsorship,
lack of personnel skills, and poor management.
From those pathologies, we conducted a causal analysis, arriving
at the three risk factors that we will discuss.

We propose to divide risk management into three activities:
risk identification, risk assessment and risk resolution. Risk
identification is the set of techniques designed to alert and
identify possible threats. Risk assessment is the quantitative
analysis of the probabilities and impacts of the identified
threats. Risk resolution is the application of resources and
effort to avoid, transfer, prevent, mitigate or assume the
risks.

In order to achieve risk management, an organization requires
a minimum level of maturity that can be associated
with CMM level 2 [8]. If an organization is not able to collect
metrics, any attempt to formally identify and assess
risks is impossible.

3.1. The major risk components

In our vision, software risks could be controlled if we
could master how to manage uncertainty, complexity
and resources. Transforming the unstructured problem of
risk assessment leads to a formal method that can be translated
into an algorithm. In order to structure the problem,
we analyzed it by decomposing project
risk into simpler parts. We used causal analysis to find the
primitive threat factors. We identified three major factors:
process risk, resource risk and product risk. Each of these
factors introduces risks by itself, but mainly through the
interaction between them.

Resource risk is affected by organizational, operational,
managerial and contractual parameters such as resources,
outsourcing, personnel, time and budget, among others. The
literature is abundant in this area. Various approaches use
subjective techniques such as guidelines and checklists [9],
[10], [11], which require expert opinion even when they
could be supported by metrics.
Engineering development work procedures such as software
development, planning, quality assurance, and configuration
management cause process risk. The more complex
a process is, the more difficult it is to manage, and the
more education, training, standards, reviews, and communication
are required. Consequently, complexity grows. The
software process complexity has been partially covered by
research in terms of subjective assessments about maturity
level and expertise [9], [10], [11]. However, we require a
more precise and objective method.

Finally, product risk is related to the final characteristics
of the product, its complexity, its conformance with
specifications and requirements, its reliability and customer
satisfaction. The product introduces its own risk factors in
terms of quantitative and qualitative attributes. We identified
two basic product-risk factors: requirement stability
and requirement complexity. Requirement stability is
measurable using the set of metrics previously described.
Due to the lack of structure in informal requirements, it is necessary
to transform them into specifications in order to
compute complexity. Other product characteristics such as
reliability and maintainability are not of interest for identifying
and assessing risk in early stages. Reliability can be measured
only after completion or near completion. Maintainability
can be measured only after the design is started. Both
measures are useful to control the project in future phases.
These estimations are useful in order to: a) identify the
trade-off function between error reduction and cost of error
reduction, b) provide a quantitative basis for accepting or
rejecting software during functional testing, and c) provide a
quantitative basis for deciding whether additional testing is
warranted based on the cost of error removal.

The process provides the description of its environment
and the theoretical requirements to execute it. Consequently,
the process introduces threats due to its requirements
and characteristics: complexity, technology required,
budget required, schedule required, and personnel skills
required. The resources represent the actual allowances in
personnel, tools, budget and schedule. They impose constraints
that may not match the process requirements.
Productivity is a consequence of the matching of these two
facets of the project.

The decomposition created by causal analysis revealed: a)
a method to identify risks by comparing the degree of mismatch
between the product and process characteristics
and the resource constraints; and b) candidate indicators
to be used in an estimation model.

3.2.  The  formulation 

We can consider software projects as experiments whose
cost and schedule are the output measures. We know
that software projects tend to overrun costs and schedule
(this fact has been confirmed by research and industry). There
are two possible ways to interpret the result of the experiment.
One hypothesis is that this behavior is abnormal, and
a consequence of a lack of process maturity (SEI/CMM approach).
Another hypothesis is that this could be a "false-abnormal"
behavior, assumed abnormal as a consequence of
inappropriate measurements.

How do we create a macro model that considers the previous
concerns and can be used during the evolutionary
prototyping stages of the process? Our hypothesis is that
a Weibull family distribution can model each of the evolution
cycles. Let us discuss the meaning of each of the variables
in the function:

x is the random variable under study. In our case, x can be
interpreted as development time.

α is a shape parameter. It reduces the variability, narrowing
the shape of the pdf.

β is a scale parameter that stretches or compresses the
graph in the x direction.

We require a third parameter (γ) to shift the curves to the
right as a consequence of the system's conceptual complexity,
reflecting learning/training delays. The functions for the
pdf and cdf are then respectively:

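\[
f(x; \gamma, \alpha, \beta) =
\begin{cases}
0, & x < \gamma \\
\dfrac{\alpha}{\beta^{\alpha}} (x - \gamma)^{\alpha - 1}
  \exp\left[-\left(\dfrac{x - \gamma}{\beta}\right)^{\alpha}\right], & x \ge \gamma
\end{cases}
\qquad \text{(Eq. 7)}
\]

\[
F(x; \gamma, \alpha, \beta) =
\begin{cases}
0, & x < \gamma \\
1 - \exp\left[-\left(\dfrac{x - \gamma}{\beta}\right)^{\alpha}\right], & x \ge \gamma
\end{cases}
\qquad \text{(Eq. 8)}
\]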
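
The pdf and cdf above are those of the standard three-parameter (shifted) Weibull distribution, and can be sketched as follows. The function names and the parameter values in the usage lines are hypothetical and are not calibrated values from the paper.

```python
import math


def weibull_pdf(x: float, gamma: float, alpha: float, beta: float) -> float:
    """Shifted Weibull density f(x; gamma, alpha, beta) (Eq. 7)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x < gamma:
        return 0.0
    z = (x - gamma) / beta
    if z == 0.0 and alpha < 1.0:
        return math.inf  # the density is unbounded at the shift point
    # (alpha / beta**alpha) * (x - gamma)**(alpha - 1)
    # is algebraically equal to (alpha / beta) * z**(alpha - 1).
    return (alpha / beta) * z ** (alpha - 1.0) * math.exp(-(z ** alpha))


def weibull_cdf(x: float, gamma: float, alpha: float, beta: float) -> float:
    """Shifted Weibull distribution function F(x; gamma, alpha, beta) (Eq. 8)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x < gamma:
        return 0.0
    return 1.0 - math.exp(-(((x - gamma) / beta) ** alpha))


# Hypothetical parameters: shape 2, scale 10 months, shift 25 months.
p = weibull_cdf(35.0, gamma=25.0, alpha=2.0, beta=10.0)
print(f"P(finish within 35 months) = {p:.3f}")
```

With x read as development time in months, the cdf gives the probability that a cycle is finished by month x, and no cycle can finish before the shift γ.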

The development life cycle can be visualized as a succession
of prototyping developments with increasing functionality,
followed by a final optimization that produces the system.
Each of these phases has the same activity pattern, so it is
reasonable to suppose that the delivery time for each one
has a probability distribution from the same Weibull family
but with different parameters.

During each prototyping cycle a certain number of problem
events occur. A problem event is an effort-consuming
situation that introduces a certain amount of functional
complexity to be solved (caused by a new requirement, a
change to a requirement, or as the consequence of rework),
and a certain amount of information exchange.

We suppose that the occurrence of problem events in
each cycle follows a Poisson distribution with a different
mean for each cycle. So, the entire development life cycle is
a non-homogeneous Poisson process. We assumed this distribution
because:

(a) There exists a certain rate of occurrence of events.

(b) The probability of more than one event occurring in a
time interval depends on the length of the interval.

(c) The number of events during one time interval is independent
of the number received prior to this time interval.
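
The assumption of a different Poisson mean per cycle can be sketched with a small simulation. The per-cycle means below are hypothetical, and each cycle is treated as one time unit; the sampling method (counting exponential inter-arrival times that fit in the cycle) is a standard way to draw Poisson counts and is not taken from the paper.

```python
import random


def sample_event_count(mean_events: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed count of problem events for a cycle
    by accumulating exponential inter-arrival times over a unit interval."""
    if mean_events <= 0:
        raise ValueError("mean_events must be positive")
    count = 0
    elapsed = rng.expovariate(mean_events)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(mean_events)
    return count


# Hypothetical mean number of problem events in four successive cycles.
cycle_means = [12.0, 8.0, 5.0, 2.0]
rng = random.Random(42)
counts = [sample_event_count(mean, rng) for mean in cycle_means]
print(counts)
```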

4.  Validation 

Our model has been calibrated and validated in two
ways: a) internal consistency, proved by mathematics and
statistics; and b) black-box validation, by comparing its outputs
in duration and effort with other available models.
Figure 3 shows a comparison of duration estimates using
COCOMO, Putnam and this model. Our model gives a
conservative estimate for projects between 4000 and
20000 LGC (128 and 640 KLOC of Ada). For the comparison,
we converted from LGC to Ada lines of non-comment
code using (Eq. 6), and then we applied the obtained size to
COCOMO and Putnam's model. The estimation seems to
be too optimistic for projects smaller than 2000 LGC or
greater than 25000 LGC.
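
The conversion step used for the comparison can be sketched as follows. The text does not state which COCOMO variant or mode was applied, so the organic-mode Basic COCOMO coefficients used here (effort = 2.4 KLOC^1.05 person-months, duration = 2.5 effort^0.38 months) are an assumption made for illustration, as are the function names.

```python
def kloc_from_lgc(lgc: float) -> float:
    """Size in KLOC from LGC, using KLOC = (32 * LGC + 150) / 1000 (Eq. 6)."""
    return (32.0 * lgc + 150.0) / 1000.0


def basic_cocomo_organic(kloc: float) -> tuple[float, float]:
    """Basic COCOMO, organic mode (assumed here, not stated in the text).

    Returns (effort in person-months, duration in months).
    """
    effort = 2.4 * kloc ** 1.05
    duration = 2.5 * effort ** 0.38
    return effort, duration


for lgc in (4000, 20000):
    effort, duration = basic_cocomo_organic(kloc_from_lgc(lgc))
    print(f"{lgc} LGC: {effort:.0f} person-months, {duration:.1f} months")
```

A different COCOMO mode would change the coefficients but not the procedure: convert LGC to size first, then feed the size to the external model.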


Figure 3: Comparison with COCOMO and
Putnam methods (duration in months; series: COCOMO, Putnam, Nogueira)

5.  Conclusions 

We addressed the issue of human dependency in the risk assessment
of evolutionary software processes by incorporating
an automated risk assessment method integrated with
evolutionary prototyping. Our approach provides a way to
structure and automate the assessment of risk. The proposed
model addresses part of the limitations of the traditional
estimation methods. We are calibrating the model
using simulations with ViteProject. Software development
is still a human-dependent activity requiring a great deal of human
communication, and without appropriate managerial decision
support tools, software engineering will remain in its
present state. We think that we need to improve our
knowledge about the internal phenomenology of the software
life cycle. It is in the human aspects of the software
process where the bottleneck is located now. Automated
risk assessment tools should consider these aspects. Without
such knowledge, prototyping issues such as incomplete
specifications, system complexity and development time
will remain unpredictable.